GENE REGULATION IN TRANSGENIC ANIMALS USING
A TRANSPOSON-BASED VECTOR

The U.S. Government has certain rights in this invention. The development of 
this invention was partially funded by the United States Government under a HATCH 
grant from the United States Department of Agriculture, partially funded by the 
United States Government with Formula 1433 funds from the United States 
Department of Agriculture and partially funded by the United States Government 
under contract DAAD 19-02016 awarded by the Army. 

FIELD OF THE INVENTION 

The present invention relates generally to cell-specific gene regulation in 
transgenic animals. Animals may be made transgenic through administration of a 
transposon-based vector through any method of administration including pronuclear 
injection, or intraembryonic, intratesticular, intraoviductal or intravenous
administration. These transgenic animals contain the gene of interest in all cells, 
including germ cells. Animals may also be made transgenic by targeting specific cells 
for uptake and gene incorporation of the transposon-based vectors. Stable 
incorporation of a gene of interest into cells of the transgenic animals is demonstrated 
by expression of the gene of interest in a cell, wherein expression is regulated by a 
promoter sequence. The promoter sequence may be provided as a transgene along 
with the gene of interest or may be endogenous to the cell. The promoter sequence 
may be constitutive or inducible, wherein inducible promoters include tissue-specific 
promoters, developmentally regulated promoters and chemically inducible promoters. 

BACKGROUND OF THE INVENTION 

Transgenic animals are desirable for a variety of reasons, including their 
potential as biological factories to produce desired molecules for pharmaceutical, 
diagnostic and industrial uses. This potential is attractive to the industry due to the 
inadequate capacity in facilities used for recombinant production of desired molecules 
and the increasing demand by the pharmaceutical industry for use of these facilities. 
Numerous attempts to produce transgenic animals have met several problems,
including low rates of gene incorporation and unstable gene incorporation.
Accordingly, improved gene technologies are needed for the development of 
transgenic animals for the production of desired molecules. 

Improved gene delivery technologies are also needed for the treatment of
disease in animals and humans. Many diseases and conditions can be treated with
gene-delivery technologies, which provide a gene of interest to a patient suffering
from the disease or the condition. An example of such a disease is Type 1 diabetes.
Type 1 diabetes is an autoimmune disease that ultimately results in destruction of the
insulin-producing β-cells in the pancreas. Although patients with Type 1 diabetes
may be treated adequately with insulin injections or insulin pumps, these therapies are
only partially effective. Insulin replacement, such as via insulin injection or pump
administration, cannot fully reverse the defect in the vascular endothelium found in
the hyperglycemic state (Pieper et al., 1996. Diabetes Res. Clin. Pract. Suppl.
S157-S162). In addition, hyper- and hypoglycemia occur frequently despite intensive
home blood glucose monitoring. Finally, careful dietary constraints are needed to
maintain an adequate ratio of calories consumed. This often causes major
psychosocial stress for many diabetic patients. Development of gene therapies
providing delivery of the insulin gene into the pancreas of diabetic patients could
overcome many of these problems and result in improved life expectancy and quality
of life.

Several of the prior art gene delivery technologies employed viruses that are 
associated with potentially undesirable side effects and safety concerns. The majority 
of current gene-delivery technologies useful for gene therapy rely on virus-based 
delivery vectors, such as adeno and adeno-associated viruses, retroviruses, and other 
viruses, which have been attenuated to no longer replicate. (Kay, M.A., et al. 2001.
Nature Medicine 7:33-40). 

There are multiple problems associated with the use of viral vectors. First,
they are not tissue-specific. In fact, a gene therapy trial using adenovirus was recently
halted because the vector was present in the patient's sperm (Gene trial to proceed
despite fears that therapy could change child's genetic makeup. The New York
Times, December 23, 2001). Second, viral vectors are likely to be transiently
incorporated, which necessitates re-treating a patient at specified time intervals. (Kay,
M.A., et al. 2001. Nature Medicine 7:33-40). Third, there is a concern that a viral-based
vector could revert to its virulent form and cause disease. Fourth, viral-based
vectors require a dividing cell for stable integration. Fifth, viral-based vectors
indiscriminately integrate into various cells and tissues, which can result in
undesirable germline integration. Sixth, the high titers required to achieve the
desired effect have resulted in the death of one patient and they are believed to be
responsible for induction of cancer in a separate study. (Science, News of the Week,
October 4, 2002).

Accordingly, what is needed is a new vector to produce transgenic animals 
and humans with stably incorporated genes, which vector does not cause disease or 
other unwanted side effects. There is also a need for DNA constructs that would be
stably incorporated into the tissues and cells of animals and humans, including cells in 
the resting state, which are not replicating. There is a further recognized need in the 
art for DNA constructs capable of delivering genes to specific tissues and cells of 
animals and humans. 

When incorporating a gene of interest into an animal for the production of a
desired protein or when incorporating a gene of interest in an animal or human for the
treatment of a disease, it is often desirable to selectively activate incorporated genes
using inducible promoters. These inducible promoters are regulated by substances
either produced or recognized by the transcription control elements within the cell in
which the gene is incorporated. In many instances, control of gene expression is
desired in transgenic animals or humans so that incorporated genes are selectively
activated at desired times and/or under the influence of specific substances.
Accordingly, what is needed is a means to selectively activate genes introduced into
the genome of cells of a transgenic animal or human. This can be taken a step further
to cause incorporation to be tissue-specific, which prevents widespread gene
incorporation throughout a patient's body (animal or human). This decreases the
amount of DNA needed for a treatment, decreases the chance of incorporation in
gametes, and targets gene delivery, incorporation, and expression to the desired tissue
where the gene is needed to function.

SUMMARY OF THE INVENTION 

The present invention addresses the problems described above by providing 
new, effective and efficient compositions for producing transgenic animals and for 
treating disease in animals or humans. Transgenic animals include all egg-laying
animals and milk-producing animals. Transgenic animals further include but are not
limited to avians, fish, amphibians, reptiles, insects, mammals and humans. In a
preferred embodiment, the animal is an avian animal. In another preferred
embodiment, the animal is a milk-producing animal, including but not limited to
bovine, porcine, ovine and equine animals. Animals are made transgenic through
administration of a composition comprising a transposon-based vector designed for
stable incorporation of a gene of interest for production of a desired protein, together
with an acceptable carrier. A transfection reagent is optionally added to the
composition before administration.

The transposon-based vectors of the present invention include a transposase,
operably-linked to a first promoter, and a coding sequence for a protein or peptide of
interest operably-linked to a second promoter, wherein the coding sequence for the
protein or peptide of interest and its operably-linked promoter are flanked by
transposase insertion sequences recognized by the transposase. The transposon-based
vector also includes the following characteristics: a) one or more modified Kozak
sequences comprising ACCATG (SEQ ID NO: 13) at the 3' end of the first promoter
to enhance expression of the transposase; b) modifications of the codons for the first
several N-terminal amino acids of the transposase, wherein the nucleotide at the third
base position of each codon was changed to an A or a T without changing the
corresponding amino acid; c) addition of one or more stop codons to enhance the
termination of transposase synthesis; and/or, d) addition of an effective polyA
sequence operably-linked to the transposase to further enhance expression of the
transposase gene.
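
Characteristics a) and b) above can be illustrated with a minimal Python sketch. It
assumes the standard genetic code and a coding sequence whose length is a multiple
of three. The example sequence, the function names and the choice of three codons
are hypothetical and are not taken from the disclosed vectors; the sketch only shows
what "changing the third base to an A or a T without changing the corresponding
amino acid" means, and how ACCATG arises at the junction with the start codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

KOZAK = "ACCATG"  # modified Kozak sequence named in the text (SEQ ID NO: 13)


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def wobble_to_a_or_t(cds, n_codons):
    """Change the third base of the first n_codons codons to A or T where a
    synonymous codon exists; codons with no such synonym are left unchanged."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for idx, codon in enumerate(codons[:n_codons]):
        if codon[2] in "AT":
            continue  # already ends in A or T
        for base in "AT":
            candidate = codon[:2] + base
            if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                codons[idx] = candidate
                break
    return "".join(codons)


# Hypothetical 5' end of a coding sequence (NOT the actual transposase sequence).
example_cds = "ATGCTGGCCAAG"
modified = wobble_to_a_or_t(example_cds, n_codons=3)
with_kozak = "ACC" + modified  # ACCATG now spans the promoter/coding junction

print(modified)
print(translate(example_cds) == translate(modified))
print(with_kozak.startswith(KOZAK))
```

The start codon ATG is left untouched because methionine has no synonymous codon,
so the check against the codon table protects it without special handling.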

Use of the compositions of the present invention results in highly efficient and
stable incorporation of a gene of interest into the genome of transfected animals. For
example, transgenic avians have been mated and produce transgenic progeny in the
G1 generation. The transgenic progeny have been mated and produce transgenic
progeny in the G2 generation.

The present invention also provides for tissue-specific incorporation and/or
expression of a gene of interest. Tissue-specific incorporation of a gene of interest
may be achieved by placing the transposase gene under the control of a tissue-specific
promoter, whereas tissue-specific expression of a gene of interest may be achieved by
placing the gene of interest under the control of a tissue-specific promoter. In some
embodiments, the gene of interest is transcribed under the influence of an ovalbumin,
or other oviduct-specific, promoter. Linking the gene of interest to an oviduct-specific
promoter in an egg-laying animal results in synthesis of a desired molecule and
deposition of the desired molecule in a developing egg. The present invention further
provides for stable incorporation and expression of genes in the epithelial cells of the
mammary gland in milk-producing animals. Transcription of the gene of interest in
the epithelial cells of the mammary gland results in synthesis of a desired molecule
and deposition of the desired molecule in the milk. A preferred molecule is a protein.
In some embodiments, the desired molecule deposited in the milk is an antiviral
protein, an antibody, or a serum protein.

In other embodiments, specific incorporation of the proinsulin gene into liver 
cells of a diabetic animal results in the improvement of the animal's condition. Such
improvement is achieved by placing a transposase gene under the control of a liver- 
specific promoter, which drives integration of the gene of interest in liver cells of the 
diabetic animal. 

The present invention advantageously produces a high number of transgenic
animals having a gene of interest stably incorporated. These transgenic animals
successfully pass the desired gene to their progeny. The transgenic animals of the
present invention also produce large amounts of a desired molecule encoded by the
transgene. Transgenic egg-laying animals, particularly avians, produce large amounts
of a desired protein that is deposited in the egg for rapid harvest and purification.
Transgenic milk-producing animals produce large amounts of a desired protein that is
deposited in the milk for rapid harvest and purification.

Any desired gene may be incorporated into the novel transposon-based vectors
of the present invention in order to synthesize a desired molecule in the transgenic
animals. Proteins, peptides and nucleic acids are preferred desired molecules to be
produced by the transgenic animals of the present invention. Particularly preferred
proteins are antibody proteins.

This invention provides a composition useful for the production of transgenic 
hens capable of producing substantially high amounts of a desired protein or peptide. 
Entire flocks of transgenic birds may be developed very quickly in order to produce
industrial amounts of desired molecules. The present invention solves the problems
inherent in the inadequate capacity of fermentation facilities used for bacterial 
production of molecules and provides a more efficient and economical way to 
produce desired molecules. Accordingly, the present invention provides a means to 
produce large amounts of therapeutic, diagnostic and reagent molecules. 

Transgenic chickens are excellent in terms of convenience and efficiency of
manufacturing molecules such as proteins and peptides. Starting with a single
transgenic rooster, thousands of transgenic offspring can be produced within a year.
(In principle, up to forty million offspring could be produced in just three
generations). Each transgenic female is expected to lay at least 250 eggs/year, each
potentially containing hundreds of milligrams of the selected protein. Flocks of
chickens numbering in the hundreds of thousands are readily handled through
established commercial systems. The technologies for obtaining eggs and
fractionating them are also well known and widely accepted. Thus, for each
therapeutic, diagnostic, or other protein of interest, large amounts of a substantially
pure material can be produced at relatively low incremental cost.
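
The scale implied by these figures can be made concrete with a short worked
calculation. The sketch below is illustrative only: it assumes 250 eggs per hen per
year, takes 100 mg of protein per egg as the low end of "hundreds of milligrams", and
takes a flock of 100,000 hens as the low end of "hundreds of thousands". None of
these inputs are measured values, and purification losses are ignored.

```python
# All figures below are illustrative assumptions, not measured values.
EGGS_PER_HEN_PER_YEAR = 250   # assumed laying rate
PROTEIN_MG_PER_EGG = 100      # assumed low end of "hundreds of milligrams"
FLOCK_SIZE = 100_000          # assumed low end of "hundreds of thousands"


def annual_yield_kg(hens, eggs_per_hen, mg_per_egg):
    """Gross protein deposited in eggs per year, in kilograms,
    before any purification losses."""
    total_mg = hens * eggs_per_hen * mg_per_egg
    return total_mg / 1_000_000  # milligrams -> kilograms


per_hen_g = EGGS_PER_HEN_PER_YEAR * PROTEIN_MG_PER_EGG / 1000  # mg -> g
flock_kg = annual_yield_kg(FLOCK_SIZE, EGGS_PER_HEN_PER_YEAR, PROTEIN_MG_PER_EGG)

print(f"per hen: {per_hen_g} g/year; flock: {flock_kg} kg/year")
```

Under these assumptions a single hen deposits about 25 g of protein per year and the
flock about 2,500 kg per year, which is the order of magnitude behind the phrase
"industrial amounts".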

A wide range of recombinant peptides and proteins can be produced in
transgenic egg-laying animals and milk-producing animals. Enzymes, hormones,
antibodies, growth factors, serum proteins, commodity proteins, biological response
modifiers, peptides and designed proteins may all be made through practice of the
present invention. For example, rough estimates suggest that it is possible to produce
in bulk growth hormone, insulin, or Factor VIII, and deposit them in transgenic egg
whites, for an incremental cost in the order of one dollar per gram. At such prices it
is feasible to consider administering such medical agents by inhalation or even orally,
instead of through injection. Even if bioavailability rates through these avenues were
low, the cost of a much higher effective dose would not be prohibitive.

In one embodiment, the egg-laying transgenic animal is an avian. The method
of the present invention may be used in avians including Ratites, Psittaciformes,
Falconiformes, Piciformes, Strigiformes, Passeriformes, Coraciformes, Ralliformes,
Cuculiformes, Columbiformes, Galliformes, Anseriformes, and Herodiones.
Preferably, the egg-laying transgenic animal is a poultry bird. More preferably, the
bird is a chicken, turkey, duck, goose or quail. Another preferred bird is a ratite, such
as an emu, an ostrich, a rhea, or a cassowary. Other preferred birds are partridge,
pheasant, kiwi, parrot, parakeet, macaw, falcon, eagle, hawk, pigeon, cockatoo, song
birds, jay bird, blackbird, finch, warbler, canary, toucan, mynah, or sparrow.

In another embodiment, the transgenic animal is a milk-producing animal, 
including but not limited to bovine, ovine, porcine, equine, and primate animals. 
Milk-producing animals include but are not limited to cows, goats, horses, pigs, 
buffalo, rabbits, non-human primates, and humans.

Accordingly, it is an object of the present invention to provide novel 
transposon-based vectors.

It is another object of the present invention to provide novel transposon-based
vectors that encode for the production of desired proteins or peptides in cells.

It is an object of the present invention to produce transgenic animals through
administration of a transposon-based vector.

Another object of the present invention is to produce transgenic animals
through administration of a transposon-based vector, wherein the transgenic animals
produce desired proteins or peptides.

Yet another object of the present invention is to produce transgenic animals 
through administration of a transposon-based vector, wherein the transgenic animals 
produce desired proteins or peptides and deposit the proteins or peptides in eggs or
milk. 

It is a further object of the present invention to produce transgenic animals
through intraembryonic, intratesticular or intraoviductal administration of a
transposon-based vector.

It is further an object of the present invention to provide a method to produce
transgenic animals through administration of a transposon-based vector that are
capable of producing transgenic progeny.

Yet another object of the present invention is to provide a method to produce 
transgenic animals through administration of a transposon-based vector that are 
capable of producing a desired molecule, such as a protein, peptide or nucleic acid.

Another object of the present invention is to provide a method to produce 
transgenic animals through administration of a transposon-based vector, wherein such 
administration results in modulation of endogenous gene expression. 

It is another object of the present invention to provide transposon-based vectors
useful for cell- or tissue-specific expression of a gene of interest in an animal or
human for the purpose of gene therapy.

It is yet another object of the present invention to provide a method to produce
transgenic avians through administration of a transposon-based vector that are capable
of producing proteins, peptides or nucleic acids.

It is another object of the present invention to produce transgenic animals
through administration of a transposon-based vector encoding an antibody or a
fragment thereof.

Still another object of the present invention is to provide a method to produce
transgenic avians through administration of a transposon-based vector that are capable
of producing proteins or peptides and depositing these proteins or peptides in the egg.

Another object of the present invention is to provide transgenic avians that 
contain a stably incorporated transgene.

Still another object of the present invention is to provide eggs containing 
desired proteins or peptides encoded by a transgene incorporated into the transgenic 
avian that produces the egg. 

It is further an object of the present invention to provide a method to produce 
transgenic milk-producing animals through administration of a transposon-based
vector that are capable of producing proteins, peptides or nucleic acids. 

Still another object of the present invention is to provide a method to produce 
transgenic milk-producing animals through administration of a transposon-based 
vector that are capable of producing proteins or peptides and depositing these proteins 
or peptides in their milk.

Another object of the present invention is to provide transgenic milk- 
producing animals that contain a stably incorporated transgene. 

Another object of the present invention is to provide transgenic milk- 
producing animals that are capable of producing proteins or peptides and depositing 
these proteins or peptides in their milk.

Yet another object of the present invention is to provide milk containing 
desired molecules encoded by a transgene incorporated into the transgenic milk- 
producing animals that produce the milk. 

Still another object of the present invention is to provide milk containing 
desired proteins or peptides encoded by a transgene incorporated into the transgenic
milk-producing animals that produce the milk. 

A further object of the present invention is to provide a method to produce
transgenic sperm through administration of a transposon-based vector to an animal.

A further object of the present invention is to provide transgenic sperm that
contain a stably incorporated transgene.

An advantage of the present invention is that transgenic animals are produced 
with higher efficiencies than observed in the prior art.

Another advantage of the present invention is that these transgenic animals
possess high copy numbers of the transgene.

Another advantage of the present invention is that the transgenic animals
produce large amounts of desired molecules encoded by the transgene.

Still another advantage of the present invention is that desired molecules are
produced by the transgenic animals much more efficiently and economically than
by prior art methods, thereby providing a means for large scale production of desired
molecules, particularly proteins and peptides.

These and other objects, features and advantages of the present invention will 
become apparent after a review of the following detailed description of the disclosed
embodiments and claims. 

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 depicts schematically a transposon-based vector containing a 
transposase operably-linked to a first promoter and a gene of interest operably-linked
to a second promoter, wherein the gene of interest and its operably-linked promoter 
are flanked by insertion sequences (IS) recognized by the transposase. "Pro" 
designates a promoter. In this and subsequent figures, the size of the actual nucleotide 
sequence is not necessarily proportionate to the box representing that sequence. 

Figure 2 depicts schematically a transposon-based vector for targeting 
deposition of a polypeptide in an egg white wherein Ov pro is the ovalbumin 
promoter, Ov protein is the ovalbumin protein and PolyA is a polyadenylation 
sequence. The TAG sequence includes a spacer, the gp41 hairpin loop from HIV I
and a protein cleavage site.

Figure 3 depicts schematically a transposon-based vector for targeting 
deposition of a polypeptide in an egg white wherein Ovo pro is the ovomucoid 
promoter and Ovo SS is the ovomucoid signal sequence. The TAG sequence includes 
a spacer, the gp41 hairpin loop from HIV I and a protein cleavage site.

Figure 4 depicts schematically a transposon-based vector for targeting
deposition of a polypeptide in an egg yolk wherein Vit pro is the vitellogenin 
promoter and Vit targ is the vitellogenin targeting sequence. 

Figure 5 depicts schematically a transposon-based vector for expression of
antibody heavy and light chains. Prepro indicates a prepro sequence from cecropin
and pro indicates a pro sequence from cecropin.

Figure 6 depicts schematically a transposon-based vector for expression of 
antibody heavy and light chains. Ent indicates an enterokinase cleavage sequence.

Figure 7 depicts schematically egg white targeted expression of antibody
heavy and light chains from one vector in either tail-to-tail (Figure 7A) or tail-to-head
(Figure 7B) configuration. In the tail-to-tail configuration, the ovalbumin signal
sequence adjacent to the gene for the light chain contains on its 3' end an enterokinase
cleavage site (not shown) to allow cleavage of the signal sequence from the light
chain, and the ovalbumin signal sequence adjacent to the gene for the heavy chain
contains on its 5' end an enterokinase cleavage site (not shown) to allow cleavage of
the signal sequence from the heavy chain. In the tail-to-head configuration, the
ovalbumin signal sequence adjacent to the gene for the heavy chain and the light
chain contains on its 3' end an enterokinase cleavage site (not shown) to allow
cleavage of the signal sequence from the heavy or light chain.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides a new, effective and efficient method of
producing transgenic animals, particularly egg-laying animals and milk-producing
animals, through administration of a composition comprising a transposon-based
vector designed for stable incorporation of a gene of interest for production of a
desired molecule.

Definitions

It is to be understood that as used in the specification and in the claims, "a" or 
"an" can mean one or more, depending upon the context in which it is used. Thus, for 
example, reference to "a cell" can mean that at least one cell can be utilized. 

The term "antibody" is used interchangeably with the term "immunoglobulin"
and is defined herein as a protein synthesized by an animal or a cell of the immune
system in response to the presence of a foreign substance commonly referred to as an
"antigen" or an "immunogen". The term antibody includes fragments of antibodies.
Antibodies are characterized by specific affinity to a site on the antigen, wherein the
site is referred to as an "antigenic determinant" or an "epitope". Antigens can be
naturally occurring or artificially engineered. Artificially engineered antigens include
but are not limited to small molecules, such as small peptides, attached to haptens
such as macromolecules, for example proteins, nucleic acids, or polysaccharides.
Artificially designed or engineered variants of naturally occurring antibodies and
artificially designed or engineered antibodies not occurring in nature are all included
in the current definition. Such variants include conservatively substituted amino acids
and other forms of substitution as described in the section concerning proteins and
polypeptides.

As used herein, the term "egg-laying animal" includes all amniotes such as
birds, turtles, lizards and monotremes. Monotremes are egg-laying mammals and
include the platypus and echidna. The term "bird" or "fowl," as used herein, is
defined as a member of the Aves class of animals which are characterized as
warm-blooded, egg-laying vertebrates primarily adapted for flying. Avians include, without
limitation, Ratites, Psittaciformes, Falconiformes, Piciformes, Strigiformes,
Passeriformes, Coraciformes, Ralliformes, Cuculiformes, Columbiformes,
Galliformes, Anseriformes, and Herodiones. The term "Ratite," as used herein, is
defined as a group of flightless, mostly large, running birds comprising several orders
and including the emus, ostriches, kiwis, and cassowaries. The term "Psittaciformes",
as used herein, includes parrots and refers to a monofamilial order of birds that exhibit
zygodactylism and have a strong hooked bill. A "parrot" is defined as any member of
the avian family Psittacidae (the single family of the Psittaciformes), distinguished by
the short, stout, strongly hooked beak. The term "chicken" as used herein denotes
chickens used for table egg production, such as egg-type chickens, chickens reared for
public meat consumption, or broilers, and chickens reared for both egg and meat
production ("dual-purpose" chickens). The term "chicken" also denotes chickens
produced by primary breeder companies, or chickens that are the parents,
grandparents, great-grandparents, etc. of those chickens reared for public table egg,
meat, or table egg and meat consumption.

The term "egg" is defined herein as a large female sex cell enclosed in a
porous, calcareous or leathery shell, produced by birds and reptiles. The term "ovum"
is defined as a female gamete, and is also known as an egg. Therefore, egg
production in all animals other than birds and reptiles, as used herein, is defined as the
production and discharge of an ovum from an ovary, or "ovulation". Accordingly, it
is to be understood that the term "egg" as used herein is defined as a large female sex
cell enclosed in a porous, calcareous or leathery shell, when a bird or reptile produces
it, or it is an ovum when it is produced by all other animals.

The term "milk-producing animal" refers herein to mammals including, but
not limited to, bovine, ovine, porcine, equine, and primate animals. Milk-producing
animals include but are not limited to cows, llamas, camels, goats, reindeer, zebu,
water buffalo, yak, horses, pigs, rabbits, non-human primates, and humans.

The term "gene" is defined herein to include a coding region for a protein,
peptide or polypeptide.

The term "vector" is used interchangeably with the terms "construct", "DNA 
construct" and "genetic construct" to denote synthetic nucleotide sequences used for 
manipulation of genetic material, including but not limited to cloning, subcloning, 
sequencing, or introduction of exogenous genetic material into cells, tissues or
organisms, such as birds. It is understood by one skilled in the art that vectors may 
contain synthetic DNA sequences, naturally occurring DNA sequences, or both. The 
vectors of the present invention are transposon-based vectors as described herein. 

When referring to two nucleotide sequences, one being a regulatory sequence, 
the term "operably-linked" is defined herein to mean that the two sequences are
associated in a manner that allows the regulatory sequence to affect expression of the 
other nucleotide sequence. It is not required that the operably-linked sequences be 
directly adjacent to one another with no intervening sequence(s). 

The term "regulatory sequence" is defined herein as including promoters, 
enhancers and other expression control elements such as polyadenylation sequences,
matrix attachment sites, insulator regions for expression of multiple genes on a single 
construct, ribosome entry/attachment sites, introns that are able to enhance 
expression, and silencers.

Transposon-Based Vectors

While not wanting to be bound by the following statement, it is believed that
the nature of the DNA construct is an important factor in successfully producing
transgenic animals. The "standard" types of plasmid and viral vectors that have
previously been almost universally used for transgenic work in all species, especially
avians, have low efficiencies and may constitute a major reason for the low rates of
transformation previously observed. The DNA (or RNA) constructs previously used
often do not integrate into the host DNA, or integrate only at low frequencies. Other
factors may have also played a part, such as poor entry of the vector into target cells.
The present invention provides transposon-based vectors that can be administered to
an animal that overcome the prior art problems relating to low transgene integration
frequencies. Two preferred transposon-based vectors of the present invention in
which a transposase, gene of interest and other polynucleotide sequences may be
introduced are termed pTnMCS (SEQ ID NO:36) and pTnMod (SEQ ID NO:1).

The transposon-based vectors of the present invention produce integration
frequencies an order of magnitude greater than has been achieved with previous
vectors. More specifically, intratesticular injections performed with a prior art
transposon-based vector (described in U.S. Patent No. 5,719,055) resulted in 41%
sperm positive roosters whereas intratesticular injections performed with the novel
transposon-based vectors of the present invention resulted in 77% sperm positive
roosters. Actual frequencies of integration were estimated by either or both of
comparative strength of the PCR signal from the sperm and histological evaluation of
the testes and sperm by quantitative PCR.

The transposon-based vectors of the present invention include a transposase
gene operably-linked to a first promoter, and a coding sequence for a desired protein
or peptide operably-linked to a second promoter, wherein the coding sequence for the
desired protein or peptide and its operably-linked promoter are flanked by transposase
insertion sequences recognized by the transposase. The transposon-based vector also
includes one or more of the following characteristics: a) one or more modified Kozak
sequences comprising ACCATG (SEQ ID NO:13) at the 3' end of the first promoter
to enhance expression of the transposase; b) modifications of the codons for the first
several N-terminal amino acids of the transposase, wherein the third base of each
codon was changed to an A or a T without changing the corresponding amino acid; c)
addition of one or more stop codons to enhance the termination of transposase
synthesis; and, d) addition of an effective polyA sequence operably-linked to the
transposase to further enhance expression of the transposase gene. Figure 1 shows a
schematic representation of several components of the transposon-based vector. The
present invention further includes vectors containing more than one gene of interest,
wherein a second or subsequent gene of interest is operably-linked to the second
promoter or to a different promoter. It is also to be understood that the
transposon-based vectors shown in the Figures are representational of the present
invention and that the order of the vector elements may be different than that shown
in the Figures, that the elements may be present in various orientations, and that the
vectors may contain additional elements not shown in the Figures.
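
As an illustration only, characteristics a), c) and d) listed above can be sketched as
simple string operations on nucleotide sequences. The function name
build_transposase_cassette, the choice of TAA as the added stop codon, and the toy
sequences in the usage note are assumptions made for this sketch; they are not taken
from pTnMCS or pTnMod.

```python
KOZAK = "ACCATG"  # modified Kozak sequence named in the text (SEQ ID NO:13)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_transposase_cassette(promoter, transposase_cds, poly_a,
                               extra_stops=("TAA",)):
    """Join promoter, coding sequence, stop codons and polyA into one string.

    Hypothetical helper: it models only the sequence-level features a), c)
    and d) described in the text, not a real cloning strategy.
    """
    promoter = promoter.upper()
    cds = transposase_cds.upper()
    poly_a = poly_a.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # a) make the promoter/start-codon junction read ACCATG
    if not promoter.endswith("ACC"):
        promoter += "ACC"
    # c) ensure the reading frame ends in a stop codon, then add the extras
    if cds[-3:] not in STOP_CODONS:
        cds += "TAA"
    cds += "".join(extra_stops)
    # d) append the polyA sequence after the stop codons
    return promoter + cds + poly_a
```

For example, build_transposase_cassette("GGGTTT", "ATGAAACCC", "AAAAAA") places
ACCATG at the promoter junction and two TAA codons before the polyA string.
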
Transposases and Insertion Sequences 

In a further embodiment of the present invention, the transposase found in the
transposase-based vector is an altered target site (ATS) transposase and the insertion
sequences are those recognized by the ATS transposase. However, the transposase
located in the transposase-based vectors is not limited to a modified ATS transposase
and can be derived from any transposase. Transposases known in the prior art include
those found in AC7, Tn5SEQ1, Tn916, Tn951, Tn1721, Tn2410, Tn1681, Tn1, Tn2,
Tn3, Tn4, Tn5, Tn6, Tn9, Tn10, Tn30, Tn101, Tn903, Tn501, Tn1000 (γδ), Tn1681,
Tn2901, AC transposons, Mp transposons, Spm transposons, En transposons, Dotted
transposons, Mu transposons, Ds transposons, dSpm transposons and I transposons.
According to the present invention, these transposases and their regulatory sequences
are modified for improved functioning as follows: a) the addition of one or more
modified Kozak sequences comprising ACCATG (SEQ ID NO:13) at the 3' end of
the promoter operably-linked to the transposase; b) a change of the codons for the first
several amino acids of the transposase, wherein the third base of each codon was
changed to an A or a T without changing the corresponding amino acid; c) the
addition of one or more stop codons to enhance the termination of transposase
synthesis; and/or, d) the addition of an effective polyA sequence operably-linked to
the transposase to further enhance expression of the transposase gene.

Although not wanting to be bound by the following statement, it is believed
that the modifications of the first several N-terminal codons of the transposase gene
increase transcription of the transposase gene, in part, by increasing strand
dissociation. It is preferable that between approximately 1 and 20, more preferably
between 3 and 15, and most preferably between 4 and 12 of the first N-terminal codons
of the transposase are modified such that the third base of each codon is changed to
an A or a T without changing the encoded amino acid. In one embodiment, the first ten
N-terminal codons of the transposase gene are modified in this manner. It is also
preferred that the transposase contain mutations that make it less specific for preferred
insertion sites and thus increase the rate of transgene insertion as discussed in U.S.
Patent No. 5,719,055.
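
The third-base modification described above can be illustrated with a short sketch
that uses the standard genetic code. The function name wobble_to_a_or_t, the rule of
trying A before T, and the toy input sequence are assumptions made for illustration;
the text does not state which of the two bases is chosen when both preserve the amino
acid. Codons such as ATG (Met) and TGG (Trp), which have no synonymous codon ending
in A or T, are left unchanged.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon (no stop handling)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def wobble_to_a_or_t(cds, n_codons=10, preferred="AT"):
    """Change the third base of the first n_codons codons to A or T
    wherever doing so leaves the encoded amino acid unchanged."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    for i, codon in enumerate(codons[:n_codons]):
        if codon[2] in "AT":
            continue  # already ends in A or T
        for base in preferred:
            candidate = codon[:2] + base
            if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                codons[i] = candidate
                break
    return "".join(codons) + cds[usable:]
```

Applied to the toy sequence ATGCTGGCCAAGTGG, the sketch changes CTG, GCC and AAG to
CTA, GCA and AAA while the translated peptide stays the same.
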

In some embodiments, the transposon-based vectors are optimized for 
expression in a particular host by changing the methylation patterns of the vector
DNA. For example, prokaryotic methylation may be reduced by using a methylation 
deficient organism for production of the transposon-based vector. The transposon- 
based vectors may also be methylated to resemble eukaryotic DNA for expression in a 
eukaryotic host. 

Transposases and insertion sequences from other analogous eukaryotic
transposon-based vectors that can also be modified and used are, for example, the
Drosophila P element derived vectors disclosed in U.S. Patent No. 6,291,243; the
Drosophila mariner element described in Sherman et al. (1998); or the Sleeping Beauty
transposon. See also Hackett et al. (1999); D. Lampe et al., 1999. Proc. Natl. Acad.
Sci. USA, 96:11428-11433; S. Fischer et al., 2001. Proc. Natl. Acad. Sci. USA,
98:6759-6764; L. Zagoraiou et al., 2001. Proc. Natl. Acad. Sci. USA, 98:11474-11478;
and D. Berg et al. (Eds.), Mobile DNA, Amer. Soc. Microbiol. (Washington,
D.C., 1989). However, it should be noted that bacterial transposon-based elements
are preferred, as there is less likelihood that a eukaryotic transposase in the recipient
species will recognize prokaryotic insertion sequences bracketing the transgene.

Many transposases recognize different insertion sequences, and therefore, it is 
to be understood that a transposase-based vector will contain insertion sequences 
recognized by the particular transposase also found in the transposase-based vector. 
In a preferred embodiment of the invention, the insertion sequences have been
shortened to about 70 base pairs in length as compared to those found in wild-type
transposons that typically contain insertion sequences of well over 100 base pairs.

While the examples provided below incorporate a "cut and insert" Tn10 based
vector that is destroyed following the insertion event, the present invention also
encompasses the use of a "rolling replication" type transposon-based vector. Use of a
rolling replication type transposon allows multiple copies of the transposon/transgene
to be made from a single transgene construct and the copies inserted. This type of
transposon-based system thereby provides for insertion of multiple copies of a
transgene into a single genome. A rolling replication type transposon-based vector
may be preferred when the promoter operably-linked to the gene of interest is endogenous
to the host cell and present in a high copy number or highly expressed. However, use
of a rolling replication system may require tight control to limit the insertion events to
non-lethal levels. Tn1, Tn2, Tn3, Tn4, Tn5, Tn9, Tn21, Tn501, Tn551, Tn951,
Tn1721, Tn2410 and Tn2603 are examples of rolling replication type transposons,
although Tn5 could be both a rolling replication and a cut and insert type transposon.
Stop Codons and PolyA Sequences

In one embodiment, the transposon-based vector contains two stop codons
operably-linked to the transposase and/or to the gene of interest. In an alternate
embodiment, one stop codon of UAA or UGA is operably linked to the transposase
and/or to the gene of interest. As used herein an "effective polyA sequence" refers to
either a synthetic or non-synthetic sequence that contains multiple and sequential
nucleotides containing an adenine base (an A polynucleotide string) and that increases
expression of the gene to which it is operably-linked. A polyA sequence may be
operably-linked to any gene in the transposon-based vector including, but not limited
to, a transposase gene and a gene of interest. In one embodiment, a polyA sequence
comprises the polynucleotide sequence provided in SEQ ID NO:28. A preferred
polyA sequence is optimized for use in the host animal or human. In one
embodiment, the polyA sequence is optimized for use in a bird, and more specifically,
a chicken. The chicken optimized polyA sequence generally contains a minimum of
60 base pairs, and more preferably between approximately 60 and several hundred
base pairs, that precede the A polynucleotide string and thereby separate the stop
codon from the A polynucleotide string. A chicken optimized polyA sequence may
also have a reduced amount of CT repeats as compared to a synthetic polyA sequence.

In one embodiment of the present invention, the polyA sequence comprises a
conalbumin polyA sequence as provided in SEQ ID NO:33 and as taken from
GenBank accession # Y00407, base pairs 10651-11058.
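
The spacing requirement stated above for a chicken optimized polyA sequence (at least
60 base pairs between the stop codon and the A polynucleotide string) can be checked
with a small sketch. The function names and the threshold of ten consecutive A
residues used to recognize an "A polynucleotide string" are assumptions made for
illustration; the text does not define a minimum run length.

```python
import re


def spacer_before_a_string(downstream, min_a_run=10):
    """Return the number of bases between the stop codon and the first run
    of at least min_a_run consecutive A residues, or None if no run exists.

    `downstream` is the sequence that begins immediately after the stop codon.
    """
    match = re.search("A{%d,}" % min_a_run, downstream.upper())
    return None if match is None else match.start()


def meets_spacing(downstream, min_spacer=60, min_a_run=10):
    """True if an A string exists and is preceded by at least min_spacer bases."""
    spacer = spacer_before_a_string(downstream, min_a_run)
    return spacer is not None and spacer >= min_spacer
```

With a toy 60 base spacer of GC repeats followed by twenty A residues the check
passes; with a 20 base spacer it fails.
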
Promoters and Enhancers
The first promoter operably-linked to the transposase gene and the second
promoter operably-linked to the gene of interest can be a constitutive promoter or an
inducible promoter. Constitutive promoters include, but are not limited to, immediate
early cytomegalovirus (CMV) promoter, herpes simplex virus 1 (HSV1) immediate
early promoter, SV40 promoter, lysozyme promoter, early and late CMV promoters,
early and late HSV promoters, β-actin promoter, tubulin promoter, Rous-Sarcoma
virus (RSV) promoter, and heat-shock protein (HSP) promoter. Inducible promoters
include tissue-specific promoters, developmentally-regulated promoters and
chemically inducible promoters. Examples of tissue-specific promoters include the
glucose 6 phosphate (G6P) promoter, vitellogenin promoter, ovalbumin promoter,
ovomucoid promoter, conalbumin promoter, ovotransferrin promoter, prolactin
promoter, kidney uromodulin promoter, and placental lactogen promoter. In one
embodiment, the vitellogenin promoter includes a polynucleotide sequence of SEQ ID
NO:17. The G6P promoter sequence may be deduced from a rat G6P gene
untranslated upstream region provided in GenBank accession number U57552.1.
Examples of developmentally-regulated promoters include the homeobox promoters
and several hormone induced promoters. Examples of chemically inducible
promoters include reproductive hormone induced promoters, antibiotic inducible
promoters such as the tetracycline inducible promoter, and the zinc-inducible
metallothionein promoter.

Other inducible promoter systems include the Lac operator repressor system
inducible by IPTG (isopropyl beta-D-thiogalactoside) (Cronin, A. et al. 2001. Genes
and Development, v. 15); ecdysone-based inducible systems (Hoppe, U. C. et al.
2000. Mol. Ther. 1:159-164); estrogen-based inducible systems (Braselmann, S. et al.
1993. Proc. Natl. Acad. Sci. 90:1657-1661); progesterone-based inducible systems
using a chimeric regulator, GLVP, which is a hybrid protein consisting of the GAL4
binding domain and the herpes simplex virus transcriptional activation domain, VP16,
and a truncated form of the human progesterone receptor that retains the ability to
bind ligand and can be turned on by RU486 (Wang, et al. 1994. Proc. Natl. Acad. Sci.
91:8180-8184); and CID-based inducible systems using chemical inducers of dimerization
(CIDs) to regulate gene expression, such as a system wherein rapamycin induces
dimerization of the cellular proteins FKBP12 and FRAP (Belshaw, P. J. et al. 1996. J.
Chem. Biol. 3:731-738; Fan, L. et al. 1999. Hum. Gene Ther. 10:2273-2285; Shariat,
S.F. et al. 2001. Cancer Res. 61:2562-2571; Spencer, D.M. 1996. Curr. Biol. 6:839-847).
Chemical substances that activate the chemically inducible promoters can be
administered to the animal containing the transgene of interest via any method known
to those of skill in the art.

Other examples of cell or tissue-specific and constitutive promoters include
but are not limited to smooth-muscle SM22 promoter, including chimeric
SM22alpha/telokin promoters (Hoggatt A.M. et al., 2002. Circ. Res. 91(12):1151-9);
ubiquitin C promoter (Biochim. Biophys. Acta, 2003. Jan. 3;1625(1):52-63); Hsf2
promoter; murine COMP (cartilage oligomeric matrix protein) promoter; early B cell-
specific mb-1 promoter (Sigvardsson M., et al., 2002. Mol. Cell Biol. 22(24):8539-51);
prostate specific antigen (PSA) promoter (Yoshimura I. et al., 2002. J. Urol.
168(6):2659-64); exorh promoter and pineal expression-promoting element (Asaoka
Y., et al., 2002. Proc. Natl. Acad. Sci. 99(24):15456-61); neural and liver ceramidase
gene promoters (Okino N. et al., 2002. Biochem. Biophys. Res. Commun.
299(1):160-6); PSP94 gene promoter/enhancer (Gabril M.Y. et al., 2002. Gene Ther.
9(23):1589-99); promoter of the human FAT/CD36 gene (Kuriki C., et al., 2002. Biol.
Pharm. Bull. 25(11):1476-8); VL30 promoter (Staplin W.R. et al., 2002. Blood
October 24, 2002); and IL-10 promoter (Brenner S., et al., 2002. J. Biol. Chem.
December 18, 2002).

Examples of avian promoters include, but are not limited to, promoters
controlling expression of egg white proteins, such as ovalbumin, ovotransferrin
(conalbumin), ovomucoid, lysozyme, ovomucin, g2 ovoglobulin, g3 ovoglobulin,
ovoflavoprotein, ovostatin (ovomacroglobulin), cystatin, avidin, thiamine-binding
protein, glutamyl aminopeptidase, minor glycoprotein 1, and minor glycoprotein 2; and
promoters controlling expression of egg-yolk proteins, such as vitellogenin, very low-
density lipoproteins, low density lipoprotein, cobalamin-binding protein,
riboflavin-binding protein, and biotin-binding protein (Awade, 1996. Z. Lebensm.
Unters. Forsch. 202:1-14). An advantage of using the vitellogenin promoter is that it
is active during the egg-laying stage of an animal's life-cycle, which allows for the
production of the protein of interest to be temporally connected to the import of the
protein of interest into the egg yolk when the protein of interest is equipped with an
appropriate targeting sequence.
Liver-specific promoters of the present invention include, but are not limited
to, the following promoters: vitellogenin promoter, G6P promoter, cholesterol-7-
alpha-hydroxylase (CYP7A) promoter, phenylalanine hydroxylase (PAH) promoter,
protein C gene promoter, insulin-like growth factor I (IGF-I) promoter, bilirubin
UDP-glucuronosyltransferase promoter, aldolase B promoter, furin promoter,
metallothionein promoter, albumin promoter, and insulin promoter.

Also included in the present invention are promoters that can be used to target
expression of a protein of interest into the milk of a milk-producing animal including,
but not limited to, β-lactoglobulin promoter, whey acidic protein promoter, lactalbumin
promoter and casein promoter.

Promoters associated with cells of the immune system may also be used.
Acute phase promoters such as interleukin (IL)-1 and IL-2 may be employed.
Promoters for heavy and light chain Ig may also be employed. The promoters of the
T cell receptor components CD4 and CD8, B cell promoters and the promoters of
CR2 (complement receptor type 2) may also be employed. Immune system promoters
are preferably used when the desired protein is an antibody protein.

Also included in this invention are modified promoters/enhancers wherein
elements of a single promoter are duplicated, modified, or otherwise changed. In one
embodiment, steroid hormone-binding domains of the ovalbumin promoter are moved
from about -6.5 kb to within approximately the first 1000 base pairs of the gene of
interest. Modifying an existing promoter with promoter/enhancer elements not found
naturally in the promoter, as well as building an entirely synthetic promoter, or
drawing promoter/enhancer elements from various genes together on a non-natural
backbone, are all encompassed by the current invention.
Accordingly, it is to be understood that the promoters contained within the
transposon-based vectors of the present invention may be entire promoter sequences
or fragments of promoter sequences. For example, in one embodiment, the promoter
operably linked to a gene of interest is an approximately 900 base pair fragment of a
chicken ovalbumin promoter (SEQ ID NO:40). The constitutive and inducible
promoters contained within the transposon-based vectors may also be modified by the
addition of one or more modified Kozak sequences of ACCATG (SEQ ID NO:13).

As indicated above, the present invention includes transposon-based vectors
containing one or more enhancers. These enhancers may or may not be
operably-linked to their native promoter and may be located at any distance from their
operably-linked promoter. A promoter operably-linked to an enhancer is referred to
herein as an "enhanced promoter." The enhancers contained within the
transposon-based vectors are preferably enhancers found in birds, and more preferably, an
ovalbumin enhancer, but are not limited to these types of enhancers. In one
embodiment, an approximately 675 base pair enhancer element of an ovalbumin
promoter is cloned upstream of an ovalbumin promoter with 300 base pairs of spacer
DNA separating the enhancer and promoter. In one embodiment, the enhancer used
as a part of the present invention comprises base pairs 1-675 of a Chicken Ovalbumin
enhancer from GenBank accession # S82527.1. The polynucleotide sequence of this
enhancer is provided in SEQ ID NO:37.

Also included in some of the transposon-based vectors of the present invention
are cap sites and fragments of cap sites. In one embodiment, approximately 50 base
pairs of a 5' untranslated region wherein the cap site resides are added on the 3' end of
an enhanced promoter or promoter. An exemplary 5' untranslated region is provided
in SEQ ID NO:38. A putative cap site residing in this 5' untranslated region
preferably comprises the polynucleotide sequence provided in SEQ ID NO:39.

In one embodiment of the present invention, the first promoter operably-linked
to the transposase gene is a constitutive promoter and the second promoter
operably-linked to the gene of interest is a tissue-specific promoter. In this embodiment, use of
the first constitutive promoter allows for constitutive activation of the transposase
gene and incorporation of the gene of interest into virtually all cell types, including
the germline of the recipient animal. Although the gene of interest is incorporated
into the germline generally, the gene of interest is only expressed in a tissue-specific
manner. It should be noted that cell- or tissue-specific expression as described herein
does not require a complete absence of expression in cells or tissues other than the
preferred cell or tissue. Instead, "cell-specific" or "tissue-specific" expression refers
to a majority of the expression of a particular gene of interest in the preferred cell or
tissue, respectively.
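
The definition above can be read as a simple proportion test. The sketch below assumes
that "a majority" means more than half of the total measured expression and that
expression has been quantified per tissue in arbitrary units; both the threshold and
the function names are assumptions made for illustration.

```python
def preferred_share(expression_by_tissue, preferred):
    """Fraction of total measured expression found in the preferred tissue."""
    total = sum(expression_by_tissue.values())
    if total <= 0:
        return 0.0  # nothing measured, so no share can be attributed
    return expression_by_tissue.get(preferred, 0.0) / total


def is_tissue_specific(expression_by_tissue, preferred):
    """True when more than half of the expression is in the preferred tissue.

    Expression elsewhere is allowed, matching the statement that specificity
    does not require a complete absence of expression in other tissues.
    """
    return preferred_share(expression_by_tissue, preferred) > 0.5
```

With hypothetical measurements of 80, 15 and 5 units in oviduct, liver and muscle,
expression would count as oviduct-specific under this reading even though the other
two tissues are not silent.
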

When incorporation of the gene of interest into the germline is not preferred,
the first promoter operably-linked to the transposase gene can be a tissue-specific
promoter. For example, transfection of a transposon-based vector containing a
transposase gene operably-linked to a liver-specific promoter such as the G6P
promoter or vitellogenin promoter provides for activation of the transposase gene and
incorporation of the gene of interest in the cells of the liver but not into the germline
and other cells generally. In this second embodiment, the second promoter
operably-linked to the gene of interest can be a constitutive promoter or an inducible promoter.
In a preferred embodiment, both the first promoter and the second promoter are a G6P
promoter. In embodiments wherein tissue-specific expression or incorporation is
desired, it is preferred that the transposon-based vector is administered directly to the
tissue of interest or to an artery leading to the tissue of interest.

Accordingly, cell specific promoters may be used to enhance transcription in
selected tissues. In birds, for example, promoters that are found in cells of the
fallopian tube, such as ovalbumin, conalbumin, ovomucoid and/or lysozyme, are used
in the vectors to ensure transcription of the gene of interest in the epithelial cells and
tubular gland cells of the fallopian tube, leading to synthesis of the desired protein
encoded by the gene and deposition into the egg white. In mammals, promoters
specific for the epithelial cells of the alveoli of the mammary gland, such as prolactin,
insulin, beta-lactoglobulin, whey acidic protein, lactalbumin, casein, and/or placental
lactogen, are used in the design of vectors used for transfection of these cells for the
production of desired proteins for deposition into the milk. In liver cells, the G6P
promoter may be employed to drive transcription of the gene of interest for protein
production. Proteins made in the liver of birds may be delivered to the egg yolk.

In order to achieve higher or more efficient expression of the transposase 
gene, the promoter and other regulatory sequences operably-linked to the transposase 
gene may be those derived from the host. These host specific regulatory sequences 
can be tissue specific as described above or can be of a constitutive nature. For 
example, an avian actin promoter and its associated polyA sequence can be operably-
linked to a transposase in a transposase-based vector for transfection into an avian.
Examples of other host specific promoters that could be operably-linked to the
transposase include the myosin and DNA or RNA polymerase promoters.

Directing Sequences 

In some embodiments of the present invention, the gene of interest is
operably-linked to a directing sequence or a sequence that provides proper
conformation to the desired protein encoded by the gene of interest. As used herein,
the term "directing sequence" refers to both signal sequences and targeting sequences.
An egg directing sequence includes, but is not limited to, an ovomucoid signal
sequence, an ovalbumin signal sequence and a vitellogenin targeting sequence. The
term "signal sequence" refers to an amino acid sequence, or the polynucleotide
sequence that encodes the amino acid sequence, that directs the protein to which it is
linked to the endoplasmic reticulum in a eukaryote, and more preferably the
translocational pores in the endoplasmic reticulum, or the plasma membrane in a
prokaryote, or mitochondria, such as for the purpose of gene therapy of mitochondrial
diseases. Signal and targeting sequences can be used to direct a desired protein into,
for example, the milk, when the transposon-based vectors are administered to a
milk-producing animal.

Signal sequences can also be used to direct a desired protein into, for example,
a secretory pathway for incorporation into the egg yolk or the egg white, when the
transposon-based vectors are administered to a bird or other egg-laying animal. One
example of such a transposon-based vector is provided in Figure 3 wherein the gene
of interest is operably linked to the ovomucoid signal sequence. The present
invention also includes a gene of interest operably-linked to a second gene containing
a signal sequence. An example of such an embodiment is shown in Figure 2 wherein
the gene of interest is operably-linked to the ovalbumin gene that contains an
ovalbumin signal sequence. Other signal sequences that can be included in the
transposon-based vectors include, but are not limited to, the ovotransferrin and
lysozyme signal sequences.

As also used herein, the term "targeting sequence" refers to an amino acid
sequence, or the polynucleotide sequence encoding the amino acid sequence, which
amino acid sequence is recognized by a receptor located on the exterior of a cell.
Binding of the receptor to the targeting sequence results in uptake of the protein or
peptide operably-linked to the targeting sequence by the cell. One example of a
targeting sequence is a vitellogenin targeting sequence that is recognized by a
vitellogenin receptor (or the low density lipoprotein receptor) on the exterior of an
oocyte. In one embodiment, the vitellogenin targeting sequence includes the
polynucleotide sequence of SEQ ID NO:18. In another embodiment, the vitellogenin
targeting sequence includes all or part of the vitellogenin gene. Other targeting
sequences include VLDL and Apo E, which are also capable of binding the
vitellogenin receptor. Since the ApoE protein is not endogenously expressed in birds,
its presence may be used advantageously to identify birds carrying the
transposon-based vectors of the present invention.

Genes of Interest Encoding Desired Proteins 

A gene of interest selected for stable incorporation is designed to encode any
desired protein or peptide or to regulate any cellular response. In some embodiments,
the desired proteins or peptides are deposited in an egg or in milk. It is to be
understood that the present invention encompasses transposon-based vectors
containing multiple genes of interest. The multiple genes of interest may each be
operably-linked to a separate promoter and other regulatory sequence(s) or may all be
operably-linked to the same promoter and other regulatory sequence(s). In one
embodiment, multiple genes of interest are linked to a single promoter and other
regulatory sequence(s) and each gene of interest is separated by a cleavage site or a
pro portion of a signal sequence.

Protein and peptide hormones are a preferred class of proteins in the present
invention. Such protein and peptide hormones are synthesized throughout the
endocrine system and include, but are not limited to, hypothalamic hormones and
hypophysiotropic hormones, anterior, intermediate and posterior pituitary hormones,
pancreatic islet hormones, hormones made in the gastrointestinal system, renal
hormones, thymic hormones, parathyroid hormones, and adrenal cortical and medullary
hormones. Specifically, hormones that can be produced using the present invention
include, but are not limited to, chorionic gonadotropin, corticotropin, erythropoietin,
glucagon, IGF-1, oxytocin, platelet-derived growth factor, calcitonin, follicle-
stimulating hormone, luteinizing hormone, thyroid-stimulating hormone, insulin,
gonadotropin-releasing hormone and its analogs, vasopressin, octreotide,
somatostatin, prolactin, adrenocorticotropic hormone, antidiuretic hormone,
thyrotropin-releasing hormone (TRH), growth hormone-releasing hormone (GHRH),
dopamine, melatonin, thyroxin (T4), parathyroid hormone (PTH), glucocorticoids
such as cortisol, mineralocorticoids such as aldosterone, androgens such as
testosterone, adrenaline (epinephrine), noradrenaline (norepinephrine), estrogens such
as estradiol, progesterone, calcitriol, calciferol, atrial-natriuretic peptide,
gastrin, secretin, cholecystokinin (CCK), neuropeptide Y, ghrelin, PYY3-36,
angiotensinogen, thrombopoietin, and leptin. By using appropriate polynucleotide
sequences, species-specific hormones may be made by transgenic animals.
In one embodiment of the present invention, the gene of interest is a proinsulin
gene and the desired molecule is insulin. Proinsulin consists of three parts: a C-
peptide and two long strands of amino acids (called the alpha and beta chains) that
later become linked together to form the insulin molecule. Figures 2 and 3 are
schematics of transposon-based vector constructs containing a proinsulin gene
operably-linked to an ovalbumin promoter and ovalbumin protein or an ovomucoid
promoter and ovomucoid signal sequence, respectively. In these embodiments,
proinsulin is expressed in the oviduct tubular gland cells and then deposited in the egg
white. One example of a proinsulin polynucleotide sequence is shown in SEQ ID
NO:21, wherein the C-peptide cleavage site spans from Arg at position 31 to Arg at
position 65.
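
The position numbering above can be illustrated by slicing a proinsulin amino acid
sequence at residues 31 and 65. SEQ ID NO:21 is not reproduced here, so the sketch
uses the commonly cited 86-residue human proinsulin sequence purely as a stand-in; the
sequence in SEQ ID NO:21 may differ. The function name split_proinsulin is
hypothetical.

```python
# Commonly cited human proinsulin sequence, used only as an illustrative
# stand-in for SEQ ID NO:21 (an assumption, not taken from the source).
HUMAN_PROINSULIN = (
    "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"      # residues 1-30
    "RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR"  # residues 31-65, Arg31 to Arg65
    "GIVEQCCTSICSLYQLENYCN"                # residues 66-86
)


def split_proinsulin(protein, first=31, last=65):
    """Split a proinsulin sequence around the span first..last (1-based).

    Returns (upstream chain, excised span, downstream chain). Raises
    ValueError if the boundary residues are not arginine.
    """
    if len(protein) <= last:
        raise ValueError("sequence is shorter than the stated cleavage span")
    if protein[first - 1] != "R" or protein[last - 1] != "R":
        raise ValueError("expected Arg at both ends of the cleavage span")
    return protein[:first - 1], protein[first - 1:last], protein[last:]
```

For the stand-in sequence this yields a 30-residue chain, a 35-residue excised span
that begins and ends with Arg, and a 21-residue chain.
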

Serum proteins including lipoproteins such as high density lipoprotein (HDL),
HDL-Milano and low density lipoprotein, albumin, clotting cascade factors, factor
VIII, factor IX, fibrinogen, and globulins are also included in the group of desired
proteins of the present invention. Immunoglobulins are one class of desired globulin
molecules and include but are not limited to IgG, IgM, IgA, IgD, IgE, IgY, lambda
chains, kappa chains and fragments thereof; Fc fragments, and Fab fragments.
Desired antibodies include, but are not limited to, naturally occurring antibodies,
human antibodies, humanized antibodies, and hybrid antibodies. Genes encoding
modified versions of naturally occurring antibodies or fragments thereof and genes
encoding artificially designed antibodies or fragments thereof may be incorporated
into the transposon-based vectors of the present invention. Desired antibodies also
include antibodies with the ability to bind specific ligands, for example, antibodies
against proteins associated with cancer-related molecules, such as anti-HER2, or
anti-CA125. Accordingly, the present invention encompasses a transposon-based vector
containing one or more genes encoding a heavy immunoglobulin (Ig) chain and a
light Ig chain. Further, more than one gene encoding for more than one antibody may
be administered in one or more transposon-based vectors of the present invention. In
this manner, an egg may contain more than one type of antibody in the egg white, the
egg yolk or both.

In one embodiment, a transposon-based vector contains a heavy Ig chain and a
light Ig chain, both operably linked to a promoter. Figures 5 and 6 schematically
depict exemplary constructs of this embodiment. More specifically, Figure 5 shows a
construct containing a cecropin pre-pro sequence and a cecropin pro sequence,
wherein the pre sequence functions to direct the resultant protein into the endoplasmic
reticulum and the pro sequences are cleaved upon secretion of
the protein from a cell into which the construct has been transfected. Figure 6 shows
a construct containing an enterokinase cleavage site. In this embodiment, it may be
required to further remove several additional amino acids from the light chain
following cleavage by enterokinase. In another embodiment, the transposon-based
vector comprises a heavy Ig chain operably-linked to one promoter and a light Ig
chain operably-linked to another promoter. Figure 7 schematically depicts an
exemplary construct of this embodiment. The present invention also encompasses a
transposon-based vector containing genes encoding portions of a heavy Ig chain
and/or portions of a light Ig chain. The present invention further includes a
transposon-based vector containing a gene that encodes a fusion protein comprising a
heavy and/or light Ig chain, or portions thereof.

Antibodies used as therapeutic reagents include but are not limited to
antibodies for use in cancer immunotherapy against specific antigens, or for providing
passive immunity to an animal or a human against an infectious disease or a toxic
agent. Antibodies used as diagnostic reagents include, but are not limited to
antibodies that may be labeled and detected with a detector, for example antibodies
with a fluorescent label attached that may be detected following exposure to specific
wavelengths. Such labeled antibodies may be primary antibodies directed to a
specific antigen, for example, rhodamine-labeled rabbit anti-growth hormone, or may
be labeled secondary antibodies, such as fluorescein-labeled goat-anti chicken IgG.
Such labeled antibodies are known to one of ordinary skill in the art. Labels useful
for attachment to antibodies are also known to one of ordinary skill in the art. Some of
these labels are described in the "Handbook of Fluorescent Probes and Research
Products", ninth edition (Richard P. Haugland (ed.), Molecular Probes, Inc., Eugene,
OR), which is incorporated herein in its entirety.

Antibodies produced using the present invention may be used as laboratory
reagents for numerous applications including radioimmunoassay, western blots, dot
blots, ELISA, immunoaffinity columns and other procedures requiring antibodies as
known to one of ordinary skill in the art. Such antibodies include primary antibodies,
secondary antibodies and tertiary antibodies, which may be labeled or unlabeled.

Antibodies that may be made with the practice of the present invention
include, but are not limited to, primary antibodies, secondary antibodies, designer
antibodies, anti-protein antibodies, anti-peptide antibodies, anti-DNA antibodies,
anti-RNA antibodies, anti-hormone antibodies, anti-hypophysiotropic peptide
antibodies, antibodies against non-natural antigens, anti-anterior pituitary hormone
antibodies, anti-posterior pituitary hormone antibodies, anti-venom antibodies,
anti-tumor marker antibodies, antibodies directed against epitopes associated with
infectious disease, including anti-viral, anti-bacterial, anti-protozoal, anti-fungal,
anti-parasitic, anti-receptor, anti-lipid, anti-phospholipid, anti-growth factor,
anti-cytokine, anti-monokine, anti-idiotype, and anti-accessory (presentation) protein
antibodies. Antibodies made with the present invention, as well as light chains or
heavy chains, may also be used to inhibit enzyme activity.

Antibodies that may be produced using the present invention include, but are
not limited to, antibodies made against the following proteins: Bovine γ-Globulin,
Serum; Bovine IgG, Plasma; Chicken γ-Globulin, Serum; Human γ-Globulin, Serum;
Human IgA, Plasma; Human IgA1, Myeloma; Human IgA2, Myeloma; Human IgA2,
Plasma; Human IgD, Plasma; Human IgE, Myeloma; Human IgG, Plasma; Human
IgG, Fab Fragment, Plasma; Human IgG, F(ab')2 Fragment, Plasma; Human IgG, Fc
Fragment, Plasma; Human IgG1, Myeloma; Human IgG2, Myeloma; Human IgG3,
Myeloma; Human IgG4, Myeloma; Human IgM, Myeloma; Human IgM, Plasma;
Human Immunoglobulin, Light Chain κ, Urine; Human Immunoglobulin, Light
Chains κ and λ, Plasma; Mouse γ-Globulin, Serum; Mouse IgG, Serum; Mouse IgM,
Myeloma; Rabbit γ-Globulin, Serum; Rabbit IgG, Plasma; and Rat γ-Globulin,
Serum. In one embodiment, the transposon-based vector comprises the coding
sequence of light and heavy chains of a murine monoclonal antibody that shows
specificity for human seminoprotein (GenBank Accession numbers AY129006 and
AY129304 for the light and heavy chains, respectively).

A further non-limiting list of antibodies that recognize other antibodies is as
follows: Anti-Chicken IgG, heavy (H) & light (L) Chain Specific (Sheep); Anti-Goat
γ-Globulin (Donkey); Anti-Goat IgG, Fc Fragment Specific (Rabbit); Anti-Guinea Pig
γ-Globulin (Goat); Anti-Human Ig, Light Chain, Type κ Specific; Anti-Human Ig,
Light Chain, Type λ Specific; Anti-Human IgA, α-Chain Specific (Goat);
Anti-Human IgA, Fab Fragment Specific; Anti-Human IgA, Fc Fragment Specific;
Anti-Human IgA, Secretory; Anti-Human IgE, ε-Chain Specific (Goat); Anti-Human
IgE, Fc Fragment Specific; Anti-Human IgG, Fc Fragment Specific (Goat);
Anti-Human IgG, γ-Chain Specific (Goat); Anti-Human IgG, Fc Fragment Specific;
Anti-Human IgG, Fd Fragment Specific; Anti-Human IgG, H & L Chain Specific
(Goat); Anti-Human IgG1, Fc Fragment Specific; Anti-Human IgG2, Fc Fragment
Specific; Anti-Human IgG2, Fd Fragment Specific; Anti-Human IgG3, Hinge
Specific; Anti-Human IgG4, Fc Fragment Specific; Anti-Human IgM, Fc Fragment
Specific; Anti-Human IgM, μ-Chain Specific; Anti-Mouse IgE, ε-Chain Specific;
Anti-Mouse γ-Globulin (Goat); Anti-Mouse IgG, γ-Chain Specific (Goat); Anti-Mouse
IgG, γ-Chain Specific (Goat) F(ab')2 Fragment; Anti-Mouse IgG, H & L Chain
Specific (Goat); Anti-Mouse IgM, μ-Chain Specific (Goat); Anti-Mouse IgM, H & L
Chain Specific (Goat); Anti-Rabbit γ-Globulin (Goat); Anti-Rabbit IgG, Fc Fragment
Specific (Goat); Anti-Rabbit IgG, H & L Chain Specific (Goat); Anti-Rat γ-Globulin
(Goat); Anti-Rat IgG, H & L Chain Specific; Anti-Rhesus Monkey γ-Globulin (Goat);
and Anti-Sheep IgG, H & L Chain Specific.

Another non-limiting list of the antibodies that may be produced using the
present invention is provided in product catalogs of companies such as Phoenix
Pharmaceuticals, Inc. (www.phoenixpeptide.com; 530 Harbor Boulevard, Belmont,
CA), Peninsula Labs (San Carlos, CA), SIGMA (St. Louis, MO;
www.sigma-aldrich.com), Cappel ICN (Irvine, California; www.icnbiomed.com), and
Calbiochem (La Jolla, California; www.calbiochem.com), which are all incorporated
herein by reference in their entirety. The polynucleotide sequences encoding these
antibodies may be obtained from the scientific literature, from patents, and from
databases such as GenBank. Alternatively, one of ordinary skill in the art may design
the polynucleotide sequence to be incorporated into the genome by choosing the
codons that encode each amino acid in the desired antibody. Antibodies made by the
transgenic animals of the present invention include antibodies that may be used as
therapeutic reagents, for example in cancer immunotherapy against specific antigens,
as diagnostic reagents and as laboratory reagents for numerous applications including
immunoneutralization, radioimmunoassay, western blots, dot blots, ELISA,
immunoprecipitation and immunoaffinity columns. Some of these antibodies include,
but are not limited to, antibodies which bind the following ligands: adrenomedullin,
amylin, calcitonin, amyloid, calcitonin gene-related peptide, cholecystokinin, gastrin,
gastric inhibitory peptide, gastrin releasing peptide, interleukin, interferon, cortistatin,
somatostatin, endothelin, sarafotoxin, glucagon, glucagon-like peptide, insulin, atrial
natriuretic peptide, BNP, CNP, neurokinin, substance P, leptin, neuropeptide Y,
melanin concentrating hormone, melanocyte stimulating hormone, orphanin,
endorphin, dynorphin, enkephalin, leumorphin, peptide F, PACAP,
PACAP-related peptide, parathyroid hormone, urocortin, corticotrophin releasing
hormone, PHM, PHI, vasoactive intestinal polypeptide, secretin, ACTH, angiotensin,
angiostatin, bombesin, endostatin, bradykinin, FMRF amide, galanin, gonadotropin
releasing hormone (GnRH) associated peptide, GnRH, growth hormone releasing
hormone, inhibin, granulocyte-macrophage colony stimulating factor (GM-CSF),
motilin, neurotensin, oxytocin, vasopressin, osteocalcin, pancreastatin, pancreatic
polypeptide, peptide YY, proopiomelanocortin, transforming growth factor, vascular
endothelial growth factor, vesicular monoamine transporter, vesicular acetylcholine
transporter, ghrelin, NPW, NPB, C3d, prokineticin, thyroid stimulating hormone,
luteinizing hormone, follicle stimulating hormone, prolactin, growth hormone,
beta-lipotropin, melatonin, kallikreins, kinins, prostaglandins, erythropoietin, p146
(SEQ ID NO:18, amino acid sequence; SEQ ID NO:19, nucleotide sequence), estrogen,
testosterone, corticosteroids, mineralocorticoids, thyroid hormone, thymic hormones,
connective tissue proteins, nuclear proteins, actin, avidin, activin, agrin, albumin, and
prohormones, propeptides, splice variants, fragments and analogs thereof.

The following is yet another non-limiting list of antibodies that can be produced
by the methods of the present invention: abciximab (ReoPro), abciximab anti-platelet
aggregation monoclonal antibody, anti-CD11a (hu1124), anti-CD18 antibody,
anti-CD20 antibody, anti-cytomegalovirus (CMV) antibody, anti-digoxin antibody,
anti-hepatitis B antibody, anti-HER-2 antibody, anti-idiotype antibody to GD3
glycolipid, anti-IgE antibody, anti-IL-2R antibody, antimetastatic cancer antibody
(mAb 17-1A), anti-rabies antibody, anti-respiratory syncytial virus (RSV) antibody,
anti-Rh antibody, anti-TCR, anti-TNF antibody, anti-VEGF antibody and Fab fragment
thereof, rattlesnake venom antibody, black widow spider venom antibody, coral snake
venom antibody, antibody against very late antigen-4 (VLA-4), C225 humanized
antibody to EGF receptor, chimeric (human & mouse) antibody against TNFα,
antibody directed against GPIIb/IIIa receptor on human platelets, gamma globulin,
anti-hepatitis B immunoglobulin, human anti-D immunoglobulin, human antibodies
against S. aureus, human tetanus immunoglobulin, humanized antibody against the
epidermal growth receptor-2, humanized antibody against the α subunit of the
interleukin-2 receptor, humanized antibody CTLA4IG, humanized antibody to the
IL-2R α-chain, humanized anti-CD40-ligand monoclonal antibody (5c8), humanized
mAb against the epidermal growth receptor-2, humanized mAb to rous sarcoma virus,
humanized recombinant antibody (IgG1κ) against respiratory syncytial virus (RSV),
lymphocyte immunoglobulin (anti-thymocyte antibody), lymphocyte
immunoglobulin, mAb against factor VII, MDX-210 bi-specific antibody against
HER-2, MDX-22, MDX-220 bi-specific antibody against TAG-72 on tumors,
MDX-33 antibody to FcγRI receptor, MDX-447 bi-specific antibody against EGF
receptor, MDX-447 bispecific humanized antibody to EGF receptor, MDX-RA
immunotoxin (ricin A linked) antibody, Medi-507 antibody (humanized form of
BTI-322) against CD2 receptor on T-cells, monoclonal antibody LDP-02,
muromonab-CD3 (OKT3) antibody, OKT3 ("muromonab-CD3") antibody, PRO 542
antibody, ReoPro ("abciximab") antibody, and TNF-IgG fusion protein.

The antibodies prepared using the methods of the present invention may also
be designed to possess specific labels that may be detected through means known to
one of ordinary skill in the art. The antibodies may also be designed to possess
specific sequences useful for purification through means known to one of ordinary
skill in the art. Specialty antibodies designed for binding specific antigens may also
be made in transgenic animals using the transposon-based vectors of the present
invention.

Production of a monoclonal antibody using the transposon-based vectors of
the present invention can be accomplished in a variety of ways. In one embodiment,
two vectors may be constructed: one that encodes the light chain, and a second vector
that encodes the heavy chain of the monoclonal antibody. These vectors may then be
incorporated into the genome of the target animal by methods disclosed herein. In an
alternative embodiment, the sequences encoding light and heavy chains of a
monoclonal antibody may be included on a single DNA construct. For example, the
coding sequence of light and heavy chains of a murine monoclonal antibody that
shows specificity for human seminoprotein can be expressed using transposon-based
constructs of the present invention (GenBank Accession numbers AY129006 and
AY129304 for the light and heavy chains, respectively).

Further included in the present invention are proteins and peptides synthesized
by the immune system including those synthesized by the thymus, lymph nodes,
spleen, and the gastrointestinal associated lymph tissues (GALT) system. The
immune system proteins and peptides that can be made in transgenic animals
using the transposon-based vectors of the present invention include, but are not
limited to, alpha-interferon, beta-interferon, gamma-interferon, alpha-interferon A,
alpha-interferon I, G-CSF, GM-CSF, interleukin-1 (IL-1), IL-2, IL-3, IL-4, IL-5, IL-6,
IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, TNF-α, and TNF-β. Other cytokines
included in the present invention include cardiotrophin, stromal cell derived factor,
macrophage derived chemokine (MDC), melanoma growth stimulatory activity
(MGSA), macrophage inflammatory proteins 1 alpha (MIP-1 alpha), 2, 3 alpha, 3
beta, 4 and 5.

Lytic peptides such as p146 are also included in the desired molecules of the
present invention. In one embodiment, the p146 peptide comprises an amino acid
sequence of SEQ ID NO:19. The present invention also encompasses a
transposon-based vector comprising a p146 nucleic acid comprising a polynucleotide
sequence of SEQ ID NO:20.

Enzymes are another class of proteins that may be made through the use of the
transposon-based vectors of the present invention. Such enzymes include, but are not
limited to, adenosine deaminase, alpha-galactosidase, cellulase, collagenase, DNase I,
hyaluronidase, lactase, L-asparaginase, pancreatin, papain, streptokinase B, subtilisin,
superoxide dismutase, thrombin, trypsin, urokinase, fibrinolysin, glucocerebrosidase
and plasminogen activator. In some embodiments wherein the enzyme could have
deleterious effects, additional amino acids and a protease cleavage site are added to
the carboxy end of the enzyme of interest in order to prevent expression of a
functional enzyme. Subsequent digestion of the enzyme with a protease results in
activation of the enzyme.

Extracellular matrix proteins are one class of desired proteins that may be
made through the use of the present invention. Examples include, but are not limited
to, collagen, fibrin, elastin, laminin, and fibronectin and subtypes thereof. Intracellular
proteins and structural proteins are other classes of desired proteins in the present
invention.

Growth factors are another desired class of proteins that may be made through
the use of the present invention and include, but are not limited to, transforming
growth factor-α (TGF-α), transforming growth factor-β (TGF-β), platelet-derived
growth factors (PDGF), fibroblast growth factors (FGF), including FGF acidic
isoforms 1 and 2, FGF basic form 2 and FGF 4, 8, 9 and 10, nerve growth factors
(NGF) including NGF 2.5s, NGF 7.0s and beta NGF and neurotrophins, brain derived
neurotrophic factor, cartilage derived factor, growth factors for stimulation of the
production of red blood cells, growth factors for stimulation of the production of
white blood cells, bone growth factors (BGF), basic fibroblast growth factor, vascular
endothelial growth factor (VEGF), granulocyte colony stimulating factor (G-CSF),
insulin like growth factor (IGF) I and II, hepatocyte growth factor, glial neurotrophic
growth factor (GDNF), stem cell factor (SCF), keratinocyte growth factor (KGF),
transforming growth factors (TGF), including TGFs alpha, beta, beta1, beta2, beta3,
skeletal growth factor, bone matrix derived growth factors, bone derived growth
factors, erythropoietin (EPO) and mixtures thereof.

Another desired class of proteins that may be made through the use of the
present invention includes, but is not limited to, leptin, leukemia inhibitory
factor (LIF), tumor necrosis factor alpha and beta, ENBREL, angiostatin, endostatin,
thrombospondin, osteogenic protein-1, bone morphogenetic proteins 2 and 7,
osteonectin, somatomedin-like peptide, and osteocalcin.

A non-limiting list of the peptides and proteins that may be made through the
use of the present invention is provided in product catalogs of companies such as
Phoenix Pharmaceuticals, Inc. (www.phoenixpeptide.com; 530 Harbor Boulevard,
Belmont, CA), Peninsula Labs (San Carlos, CA), SIGMA (St. Louis, MO;
www.sigma-aldrich.com), Cappel ICN (Irvine, California; www.icnbiomed.com),
and Calbiochem (La Jolla, California; www.calbiochem.com). The polynucleotide
sequences encoding these proteins and peptides of interest may be obtained from the
scientific literature, from patents, and from databases such as GenBank.
Alternatively, one of ordinary skill in the art may design the polynucleotide sequence
to be incorporated into the genome by choosing the codons that encode each
amino acid in the desired protein or peptide.

Some of these desired proteins or peptides that may be made through the use
of the present invention include, but are not limited to, the following: adrenomedullin,
amylin, calcitonin, amyloid, calcitonin gene-related peptide, cholecystokinin, gastrin,
gastric inhibitory peptide, gastrin releasing peptide, interleukin, interferon, cortistatin,
somatostatin, endothelin, sarafotoxin, glucagon, glucagon-like peptide, insulin, atrial
natriuretic peptide, BNP, CNP, neurokinin, substance P, leptin, neuropeptide Y,
melanin concentrating hormone, melanocyte stimulating hormone, orphanin,
endorphin, dynorphin, enkephalin, leumorphin, peptide F, PACAP, PACAP-related
peptide, parathyroid hormone, urocortin, corticotrophin releasing hormone, PHM,
PHI, vasoactive intestinal polypeptide, secretin, ACTH, angiotensin, angiostatin,
bombesin, endostatin, bradykinin, FMRF amide, galanin, gonadotropin releasing
hormone (GnRH) associated peptide, GnRH, growth hormone releasing hormone,
inhibin, granulocyte-macrophage colony stimulating factor (GM-CSF), motilin,
neurotensin, oxytocin, vasopressin, osteocalcin, pancreastatin, pancreatic polypeptide,
peptide YY, proopiomelanocortin, transforming growth factor, vascular endothelial
growth factor, vesicular monoamine transporter, vesicular acetylcholine transporter,
ghrelin, NPW, NPB, C3d, prokineticin, thyroid stimulating hormone, luteinizing
hormone, follicle stimulating hormone, prolactin, growth hormone, beta-lipotropin,
melatonin, kallikreins, kinins, prostaglandins, erythropoietin, p146 (SEQ ID NO:19,
amino acid sequence; SEQ ID NO:20, nucleotide sequence), thymic hormones,
connective tissue proteins, nuclear proteins, actin, avidin, activin, agrin, albumin, and
prohormones, propeptides, splice variants, fragments and analogs thereof.
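
The codon-selection approach mentioned above, in which a polynucleotide sequence
is designed by choosing a codon for each amino acid of the desired protein or
peptide, can be illustrated with a minimal Python sketch. The codon table below is an
assumption made for illustration only: it picks one arbitrary codon per amino acid
from the standard genetic code and does not reflect any codon preference stated in
this disclosure. The names PREFERRED_CODON, STOP_CODON and back_translate are
hypothetical.

```python
# Illustrative sketch only. One codon per amino acid is chosen arbitrarily from
# the standard genetic code; the table and all names are assumptions and are
# not taken from the disclosure.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
STOP_CODON = "TGA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return a DNA coding sequence encoding `protein` (one-letter codes)."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in PREFERRED_CODON:
            raise ValueError(
                f"no codon defined for residue {residue!r} at position {position}"
            )
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)  # terminate the open reading frame
    return "".join(codons)


if __name__ == "__main__":
    # Met-Lys-Val followed by a stop codon
    print(back_translate("MKV"))  # ATGAAGGTGTGA
```

In practice the choice among synonymous codons would depend on the host animal,
which this sketch does not attempt to model.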

Other desired proteins that may be made by the transgenic animals of the
present invention include bacitracin, polymyxin B, vancomycin, cyclosporine,
anti-RSV antibody, alpha-1 antitrypsin (AAT), anti-cytomegalovirus antibody,
anti-hepatitis antibody, anti-inhibitor coagulant complex, anti-rabies antibody,
anti-Rh(D) antibody, adenosine deaminase, anti-digoxin antibody, antivenin crotalidae
(rattlesnake venom antibody), antivenin latrodectus (black widow spider venom
antibody), antivenin micrurus (coral snake venom antibody), aprotinin, corticotropin
(ACTH), diphtheria antitoxin, lymphocyte immune globulin (anti-thymocyte
antibody), protamine, thyrotropin, capreomycin, α-galactosidase, gramicidin,
streptokinase, tetanus toxoid, tyrothricin, IGF-1, proteins of varicella vaccine,
anti-TNF antibody, anti-IL-2r antibody, anti-HER-2 antibody, OKT3
("muromonab-CD3") antibody, TNF-IgG fusion protein, ReoPro ("abciximab")
antibody, ACTH fragment 1-24, desmopressin, gonadotropin-releasing hormone,
histrelin, leuprolide, lypressin, nafarelin, peptide that binds GPIIb/GPIIIa on platelets
(integrilin), goserelin, capreomycin, colistin, anti-respiratory syncytial virus,
lymphocyte immune globulin (Thymoglobulin, Atgam), panorex, alpha-antitrypsin,
botulinin, lung surfactant protein, tumor necrosis receptor-IgG fusion protein (enbrel),
gonadorelin, proteins of influenza vaccine, proteins of rotavirus vaccine, proteins of
haemophilus b conjugate vaccine, proteins of poliovirus vaccine, proteins of
pneumococcal conjugate vaccine, proteins of meningococcal C vaccine, proteins of
influenza vaccine, megakaryocyte growth and development factor (MGDF),
neuroimmunophilin ligand-A (NIL-A), brain-derived neurotrophic factor (BDNF),
glial cell line-derived neurotrophic factor (GDNF), leptin (native), leptin B, leptin C,
IL-1RA (interleukin-1RA), R-568, novel erythropoiesis-stimulating protein (NESP),
humanized mAb to rous sarcoma virus (MEDI-493), glutamyl-tryptophan dipeptide
IM862, LFA-3TIP immunosuppressive, humanized anti-CD40-ligand monoclonal
antibody (5c8), gelsonin enzyme, tissue factor pathway inhibitor (TFPI), proteins of
meningitis B vaccine, antimetastatic cancer antibody (mAb 17-1A), chimeric (human
& mouse) mAb against TNFα, mAb against factor VII, relaxin, capreomycin,
glycopeptide (LY333328), recombinant human activated protein C (rhAPC),
humanized mAb against the epidermal growth receptor-2, alteplase, anti-CD20
antigen, C2B8 antibody, insulin-like growth factor-I, atrial natriuretic peptide
(anaritide), tenecteplase, anti-CD11a antibody (hu1124), anti-CD18 antibody, mAb
LDP-02, anti-VEGF antibody, Fab fragment of anti-VEGF Ab, APO2 ligand (tumor
necrosis factor-related apoptosis-inducing ligand), rTGF-β (transforming growth
factor-β), alpha-antitrypsin, ananain (a pineapple enzyme), humanized mAb
CTLA4IG, PRO 542 (mAb), D2E7 (mAb), calf intestine alkaline phosphatase,
α-L-iduronidase, α-L-galactosidase (human), glutamic acid decarboxylase, acid
sphingomyelinase, bone morphogenetic protein-2 (rhBMP-2), proteins of HIV
vaccine, T cell receptor (TCR) peptide vaccine, TCR peptides, V beta 3 and V beta
13.1 (IR502), (IR501), BI 1050/1272 mAb against very late antigen-4 (VLA-4),
C225 humanized mAb to EGF receptor, anti-idiotype antibody to GD3 glycolipid,
antibacterial peptide against H. pylori, MDX-447 bispecific humanized mAb to EGF
receptor, anti-cytomegalovirus (CMV), Medi-491 B19 parvovirus vaccine, humanized
recombinant mAb (IgG1κ) against respiratory syncytial virus (RSV), urinary tract
infection vaccine (against "pili" on Escherichia coli strains), proteins of lyme disease
vaccine against B. burgdorferi protein (DbpA), proteins of Medi-501 human
papilloma virus-11 vaccine (HPV), Streptococcus pneumoniae vaccine, Medi-507
mAb (humanized form of BTI-322) against CD2 receptor on T-cells, MDX-33 mAb
to FcγRI receptor, MDX-RA immunotoxin (ricin A linked) mAb, MDX-210
bi-specific mAb against HER-2, MDX-447 bi-specific mAb against EGF receptor,
MDX-22, MDX-220 bi-specific mAb against TAG-72 on tumors, colony-stimulating
factor (CSF) (molgramostim), humanized mAb to the IL-2R α-chain (basiliximab),
mAb to IgE (IGE 025A), myelin basic protein-altered peptide (MSP771A),
humanized mAb against the epidermal growth receptor-2, humanized mAb against the
α subunit of the interleukin-2 receptor, low molecular weight heparin,
anti-hemophilic factor, and bactericidal/permeability-increasing protein (r-BPI).

The peptides and proteins made using the present invention may be labeled
using labels and techniques known to one of ordinary skill in the art. Some of these
labels are described in the "Handbook of Fluorescent Probes and Research Products",
ninth edition (Richard P. Haugland (ed.), Molecular Probes, Inc., Eugene, OR), which
is incorporated herein in its entirety. Some of these labels may be genetically
engineered into the polynucleotide sequence for the expression of the selected protein
or peptide. The peptides and proteins may also have label-incorporation "handles"
incorporated to allow labeling of a protein that is otherwise difficult or impossible to
label.

It is to be understood that the various classes of desired peptides and proteins, 
as well as specific peptides and proteins described in this section may be modified as 
described below by inserting selected codons for desired amino acid substitutions into 
the gene incorporated into the transgenic animal. 
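
As a rough illustration of inserting a selected codon to obtain a desired amino acid
substitution, the following Python sketch replaces the codon for one residue in a
coding sequence. It is a simplified sketch under stated assumptions: the coding
sequence is an in-frame DNA string with no introns, residues are numbered from the
amino terminal starting at 1, and the function name substitute_codon and the example
sequence are hypothetical rather than taken from this disclosure.

```python
def substitute_codon(coding_seq: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for one residue in an in-frame coding sequence.

    `residue_number` is 1-based and counts from the amino terminal.
    All names and the example below are hypothetical illustrations.
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(new_codon) != 3:
        raise ValueError("a codon must contain exactly three nucleotides")
    codon_count = len(coding_seq) // 3
    if not 1 <= residue_number <= codon_count:
        raise IndexError(f"residue_number must be between 1 and {codon_count}")
    start = 3 * (residue_number - 1)  # index of the first base of the codon
    return coding_seq[:start] + new_codon.upper() + coding_seq[start + 3:]


if __name__ == "__main__":
    # ATG AAG GTG TGA encodes Met-Lys-Val-stop; replacing codon 2 (AAG, Lys)
    # with CGC (Arg) yields Met-Arg-Val-stop.
    print(substitute_codon("ATGAAGGTGTGA", 2, "CGC"))  # ATGCGCGTGTGA
```

The lysine-to-arginine change in the example keeps the reading frame intact because
exactly three nucleotides are exchanged for three.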

The present invention may also be used to produce desired molecules other
than proteins and peptides including, but not limited to, lipoproteins such as high
density lipoprotein (HDL), HDL-Milano, and low density lipoprotein, lipids,
carbohydrates, siRNA and ribozymes. In these embodiments, a gene of interest
encodes a nucleic acid molecule or a protein that directs production of the desired
molecule.

The present invention further encompasses the use of inhibitory molecules to
inhibit endogenous (i.e., non-vector) protein production. These inhibitory molecules
include antisense nucleic acids, siRNA and inhibitory proteins. In one embodiment, a
transposon-based vector containing an ovalbumin DNA sequence that, upon
transcription, forms a double stranded RNA molecule is transfected into an animal
such as a bird, and the bird's production of endogenous ovalbumin protein is reduced
by the RNA interference (RNAi) mechanism. Additionally, inducible knockouts or
knockdowns of the endogenous protein may be created to achieve a reduction or
inhibition of endogenous protein production.

Modified Desired Proteins and Peptides

"Proteins", "peptides", "polypeptides" and "oligopeptides" are chains of amino
acids (typically L-amino acids) whose alpha carbons are linked through peptide bonds
formed by a condensation reaction between the carboxyl group of the alpha carbon of
one amino acid and the amino group of the alpha carbon of another amino acid. The
terminal amino acid at one end of the chain (i.e., the amino terminal) has a free amino
group, while the terminal amino acid at the other end of the chain (i.e., the carboxy
terminal) has a free carboxyl group. As such, the term "amino terminus" (abbreviated
N-terminus) refers to the free alpha-amino group on the amino acid at the amino
terminal of the protein, or to the alpha-amino group (imino group when participating
in a peptide bond) of an amino acid at any other location within the protein.
Similarly, the term "carboxy terminus" (abbreviated C-terminus) refers to the free
carboxyl group on the amino acid at the carboxy terminus of a protein, or to the
carboxyl group of an amino acid at any other location within the protein.

Typically, the amino acids making up a protein are numbered in order, starting 
at the amino terminal and increasing in the direction toward the carboxy terminal of 
the protein. Thus, when one amino acid is said to "follow" another, that amino acid is 
positioned closer to the carboxy terminal of the protein than the preceding amino acid. 

The term "residue" is used herein to refer to an amino acid (D or L) or an
amino acid mimetic that is incorporated into a protein by an amide bond. As such, the
amino acid may be a naturally occurring amino acid or, unless otherwise limited, may
encompass known analogs of natural amino acids that function in a manner similar to
the naturally occurring amino acids (i.e., amino acid mimetics). Moreover, an amide
bond mimetic includes peptide backbone modifications well known to those skilled in
the art.

Furthermore, one of skill will recognize that, as mentioned above, individual
substitutions, deletions or additions which alter, add or delete a single amino acid or a
small percentage of amino acids (typically less than about 5%, more typically less
than about 1%) in an encoded sequence are conservatively modified variations where
the alterations result in the substitution of an amino acid with a chemically similar
amino acid. Conservative substitution tables providing functionally similar amino
acids are well known in the art. The following six groups each contain amino acids
that are conservative substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T);

2) Aspartic acid (D), Glutamic acid (E);

3) Asparagine (N), Glutamine (Q);

4) Arginine (R), Lysine (K);

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
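
The six groups listed above can be encoded directly to check whether a given
substitution is conservative and to estimate what fraction of a sequence has been
altered. The following Python sketch is a minimal illustration under stated
assumptions: it compares two aligned sequences of equal length, handles substitutions
only (not deletions or additions), and treats an unchanged residue as no substitution
at all. The names CONSERVATIVE_GROUPS, is_conservative and classify_substitutions
are hypothetical.

```python
# The six conservative-substitution groups listed above, as one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original: str, substitute: str) -> bool:
    """True if two different residues belong to the same group above."""
    original, substitute = original.upper(), substitute.upper()
    if original == substitute:
        return False  # identical residues: not a substitution
    return any(original in g and substitute in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """Compare equal-length sequences position by position.

    Returns (changes, fraction), where each change is a tuple of
    (1-based position, original, substitute, conservative?) and fraction is
    the share of residues that differ.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length (substitutions only)")
    changes = [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if r != v
    ]
    fraction = len(changes) / len(reference) if reference else 0.0
    return changes, fraction


if __name__ == "__main__":
    changes, fraction = classify_substitutions("MKVLA", "MRVLG")
    print(changes)   # [(2, 'K', 'R', True), (5, 'A', 'G', False)]
    print(fraction)  # 0.4
```

In the example, lysine to arginine falls within group 4 and is reported as
conservative, while alanine to glycine is not, because glycine does not appear in any
of the six groups.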

A conservative substitution is a substitution in which the substituting amino
acid (naturally occurring or modified) is structurally related to the amino acid being
substituted, i.e., has about the same size and electronic properties as the amino acid
being substituted. Thus, the substituting amino acid would have the same or a similar
functional group in the side chain as the original amino acid. A "conservative
substitution" also refers to utilizing a substituting amino acid which is identical to the
amino acid being substituted except that a functional group in the side chain is
protected with a suitable protecting group.

Suitable protecting groups are described in Greene and Wuts, "Protective
Groups in Organic Synthesis", John Wiley and Sons, Chapters 5 and 7, 1991, the
teachings of which are incorporated herein by reference. Preferred protecting groups
are those which facilitate transport of the peptide through membranes, for example, by
reducing the hydrophilicity and increasing the lipophilicity of the peptide, and which
can be cleaved, either by hydrolysis or enzymatically (Ditter et al., 1968. J. Pharm.
Sci. 57:783; Ditter et al., 1968. J. Pharm. Sci. 57:828; Ditter et al., 1969. J. Pharm.
Sci. 58:557; King et al., 1987. Biochemistry 26:2294; Lindberg et al., 1989. Drug
Metabolism and Disposition 17:311; Tunek et al., 1988. Biochem. Pharm. 37:3867;
Anderson et al., 1985. Arch. Biochem. Biophys. 239:538; and Singhal et al., 1987.
FASEB J. 1:220). Suitable hydroxyl protecting groups include ester, carbonate and
carbamate protecting groups. Suitable amine protecting groups include acyl groups
and alkoxy or aryloxy carbonyl groups, as described above for N-terminal protecting
groups. Suitable carboxylic acid protecting groups include aliphatic, benzyl and aryl
esters, as described below for C-terminal protecting groups. In one embodiment, the
carboxylic acid group in the side chain of one or more glutamic acid or aspartic acid
residues in a peptide of the present invention is protected, preferably as a methyl,
ethyl, benzyl or substituted benzyl ester, more preferably as a benzyl ester.

Provided below are groups of naturally occurring and modified amino acids in
which each amino acid in a group has similar electronic and steric properties. Thus, a
conservative substitution can be made by substituting an amino acid with another
amino acid from the same group. It is to be understood that these groups are
non-limiting, i.e., that there are additional modified amino acids which could be
included in each group.

Group I includes leucine, isoleucine, valine, methionine and modified amino acids
having the following side chains: ethyl, n-propyl and n-butyl. Preferably, Group I
includes leucine, isoleucine, valine and methionine.

Group II includes glycine, alanine, valine and a modified amino acid having an ethyl
side chain. Preferably, Group II includes glycine and alanine.

Group III includes phenylalanine, phenylglycine, tyrosine, tryptophan,
cyclohexylmethyl glycine, and modified amino acid residues having substituted
benzyl or phenyl side chains. Preferred substituents include one or more of
the following: halogen, methyl, ethyl, nitro, -NH2, methoxy, ethoxy and -CN.
Preferably, Group III includes phenylalanine, tyrosine and tryptophan.

Group IV includes glutamic acid, aspartic acid, a substituted or unsubstituted
aliphatic, aromatic or benzylic ester of glutamic or aspartic acid (e.g., methyl,
ethyl, n-propyl, iso-propyl, cyclohexyl, benzyl or substituted benzyl),
glutamine, asparagine, -CO-NH-alkylated glutamine or asparagine (e.g.,
methyl, ethyl, n-propyl and iso-propyl) and modified amino acids having the
side chain -(CH2)3-COOH, an ester thereof (substituted or unsubstituted
aliphatic, aromatic or benzylic ester), an amide thereof and a substituted or
unsubstituted N-alkylated amide thereof. Preferably, Group IV includes
glutamic acid, aspartic acid, methyl aspartate, ethyl aspartate, benzyl aspartate,
methyl glutamate, ethyl glutamate, benzyl glutamate, glutamine and
asparagine.

Group V includes histidine, lysine, ornithine, arginine, N-nitroarginine,
β-cycloarginine, γ-hydroxyarginine, N-amidinocitrulline and
2-amino-4-guanidinobutanoic acid, homologs of lysine, homologs of arginine and
homologs of ornithine. Preferably, Group V includes histidine, lysine,
arginine and ornithine. A homolog of an amino acid includes from 1 to about
3 additional or subtracted methylene units in the side chain.

Group VI includes serine, threonine, cysteine and modified amino acids having
C1-C5 straight or branched alkyl side chains substituted with -OH or -SH, for
example, -CH2CH2OH, -CH2CH2CH2OH or -CH2CH2OHCH3. Preferably,
Group VI includes serine, cysteine or threonine.
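The grouping above can be treated as a simple lookup. The following Python sketch is illustrative only: it encodes just the preferred, unmodified members of Groups I to VI by name (ester derivatives and other modified amino acids are omitted), and the helper names `groups_of` and `is_conservative` are hypothetical and not part of this specification.

```python
# Preferred members of each group as listed above.
# Assumption: modified amino acids and ester derivatives are left out for brevity.
PREFERRED_GROUPS = {
    "I": {"leucine", "isoleucine", "valine", "methionine"},
    "II": {"glycine", "alanine"},
    "III": {"phenylalanine", "tyrosine", "tryptophan"},
    "IV": {"glutamic acid", "aspartic acid", "glutamine", "asparagine"},
    "V": {"histidine", "lysine", "arginine", "ornithine"},
    "VI": {"serine", "cysteine", "threonine"},
}


def groups_of(amino_acid):
    """Return the labels of every group that contains the named amino acid."""
    name = amino_acid.strip().lower()
    return {label for label, members in PREFERRED_GROUPS.items() if name in members}


def is_conservative(original, substitute):
    """A substitution is treated as conservative when both residues share a group."""
    return bool(groups_of(original) & groups_of(substitute))


if __name__ == "__main__":
    print(is_conservative("leucine", "valine"))      # both in Group I
    print(is_conservative("glycine", "tryptophan"))  # Group II versus Group III
```

Because the groups are stated to be non-limiting, a fuller table could add the modified amino acids named in each group without changing the lookup logic.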

In another aspect, suitable substitutions for amino acid residues include
"severe" substitutions. A "severe substitution" is a substitution in which the
substituting amino acid (naturally occurring or modified) has significantly different
size and/or electronic properties compared with the amino acid being substituted.
Thus, the side chain of the substituting amino acid can be significantly larger (or
smaller) than the side chain of the amino acid being substituted and/or can have
functional groups with significantly different electronic properties than the amino acid
being substituted. Examples of severe substitutions of this type include the
substitution of phenylalanine or cyclohexylmethyl glycine for alanine, isoleucine for
glycine, a D amino acid for the corresponding L amino acid, or
-NH-CH[(-CH2)5-COOH]-CO- for aspartic acid. Alternatively, a functional group may be
added to the side chain, deleted from the side chain or exchanged with another
functional group. Examples of severe substitutions of this type include adding a
functional group to the side chain of valine, leucine or isoleucine, exchanging the
carboxylic acid in the side chain of aspartic acid or glutamic acid with an amine, or
deleting the amine group in the side chain of lysine or ornithine. In yet another
alternative, the side chain of the substituting amino acid can have significantly
different steric and electronic properties than the functional group of the amino acid
being substituted. Examples of such modifications include tryptophan for glycine,
lysine for aspartic acid and -(CH2)4COOH for the side chain of serine. These examples
are not meant to be limiting.

In another embodiment, for example in the synthesis of a peptide 26 amino
acids in length, the individual amino acids may be substituted in the
following manner:

AA1 is serine, glycine, alanine, cysteine or threonine;
AA2 is alanine, threonine, glycine, cysteine or serine;
AA3 is valine, arginine, leucine, isoleucine, methionine, ornithine, lysine,
N-nitroarginine, β-cycloarginine, γ-hydroxyarginine, N-amidinocitrulline or
2-amino-4-guanidinobutanoic acid;
AA4 is proline, leucine, valine, isoleucine or methionine;
AA5 is tryptophan, alanine, phenylalanine, tyrosine or glycine;
AA6 is serine, glycine, alanine, cysteine or threonine;
AA7 is proline, leucine, valine, isoleucine or methionine;
AA8 is alanine, threonine, glycine, cysteine or serine;
AA9 is alanine, threonine, glycine, cysteine or serine;
AA10 is leucine, isoleucine, methionine or valine;
AA11 is serine, glycine, alanine, cysteine or threonine;
AA12 is leucine, isoleucine, methionine or valine;
AA13 is leucine, isoleucine, methionine or valine;
AA14 is glutamine, glutamic acid, aspartic acid, asparagine, or a substituted or
unsubstituted aliphatic or aryl ester of glutamic acid or aspartic acid;
AA15 is arginine, N-nitroarginine, β-cycloarginine, γ-hydroxyarginine,
N-amidinocitrulline or 2-amino-4-guanidinobutanoic acid;
AA16 is proline, leucine, valine, isoleucine or methionine;
AA17 is serine, glycine, alanine, cysteine or threonine;
AA18 is glutamic acid, aspartic acid, asparagine, glutamine or a substituted or
unsubstituted aliphatic or aryl ester of glutamic acid or aspartic acid;
AA19 is aspartic acid, asparagine, glutamic acid, glutamine, leucine, valine, isoleucine,
methionine or a substituted or unsubstituted aliphatic or aryl ester of glutamic acid or
aspartic acid;
AA20 is valine, arginine, leucine, isoleucine, methionine, ornithine, lysine,
N-nitroarginine, β-cycloarginine, γ-hydroxyarginine, N-amidinocitrulline or
2-amino-4-guanidinobutanoic acid;
AA21 is alanine, threonine, glycine, cysteine or serine;
AA22 is alanine, threonine, glycine, cysteine or serine;
AA23 is histidine, serine, threonine, cysteine, lysine or ornithine;
AA24 is threonine, aspartic acid, serine, glutamic acid or a substituted or unsubstituted
aliphatic or aryl ester of glutamic acid or aspartic acid;
AA25 is asparagine, aspartic acid, glutamic acid, glutamine, leucine, valine,
isoleucine, methionine or a substituted or unsubstituted aliphatic or aryl ester of
glutamic acid or aspartic acid; and
AA26 is cysteine, histidine, serine, threonine, lysine or ornithine.

It is to be understood that these amino acid substitutions may be made for
longer or shorter peptides than the 26-mer in the preceding example, and for
proteins.

In one embodiment of the present invention, codons for the first several
N-terminal amino acids of the transposase are modified such that the third base of each
codon is changed to an A or a T without changing the corresponding amino acid. It is
preferable that between approximately 1 and 20, more preferably 3 and 15, and most
preferably between 4 and 12 of the first N-terminal codons of the gene of interest are
modified such that the third base of each codon is changed to an A or a T without
changing the corresponding amino acid. In one embodiment, the first ten N-terminal
codons of the gene of interest are modified in this manner.
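The codon modification described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: it assumes the standard genetic code, tries A before T when both are synonymous (an arbitrary choice), and leaves a codon untouched when no synonymous A- or T-ending codon exists (for example ATG or TGG). The function name `at_third_base` is hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of a DNA string into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def at_third_base(dna, n_codons=10):
    """Change the third base of the first n_codons codons to A or T when synonymous."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    out = []
    for start in range(0, usable, 3):
        codon = dna[start:start + 3]
        if start // 3 < n_codons and codon[2] in "GC":
            for base in "AT":  # assumption: prefer A, fall back to T
                candidate = codon[:2] + base
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    out.append(dna[usable:])  # keep any trailing partial codon unchanged
    return "".join(out)


if __name__ == "__main__":
    original = "ATGGCCCTGAAGTGGGAC"  # hypothetical sequence for illustration
    modified = at_third_base(original, n_codons=6)
    print(modified, translate(original) == translate(modified))
```

The encoded protein is unchanged by construction, since a codon is only replaced when the candidate translates to the same amino acid.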

When several desired proteins, protein fragments or peptides are encoded in
the gene of interest to be incorporated into the genome, one of skill in the art will
appreciate that the proteins, protein fragments or peptides may be separated by a
spacer molecule such as, for example, a peptide consisting of one or more amino
acids. Generally, the spacer will have no specific biological activity other than to join
the desired proteins, protein fragments or peptides together, or to preserve some
minimum distance or other spatial relationship between them. However, the
constituent amino acids of the spacer may be selected to influence some property of
the molecule such as the folding, net charge, or hydrophobicity. The spacer may also
be contained within a nucleotide sequence with a purification handle or be flanked by
proteolytic cleavage sites.

Such polypeptide spacers may have from about 5 to about 40 amino acid
residues. The spacers in a polypeptide are independently chosen, but are preferably
all the same. The spacers should allow for flexibility of movement in space and are
therefore typically rich in small amino acids, for example, glycine, serine, proline or
alanine. Preferably, peptide spacers contain at least 60%, more preferably at least
80% glycine or alanine. In addition, peptide spacers generally have little or no
biological and antigenic activity. Preferred spacers are (Gly-Pro-Gly-Gly)x (SEQ ID
NO:5) and (Gly4-Ser)y, wherein x is an integer from about 3 to about 9 and y is an
integer from about 1 to about 8. Specific examples of suitable spacers include
(Gly-Pro-Gly-Gly)3
SEQ ID NO:6 Gly Pro Gly Gly Gly Pro Gly Gly Gly Pro Gly Gly
(Gly4-Ser)3
SEQ ID NO:7 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
or (Gly4-Ser)4
SEQ ID NO:8 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
Gly Gly Gly Gly Ser.
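The repeat notation used for the spacers above can be expanded mechanically. The following Python sketch is illustrative only; the helper names `build_spacer` and `gly_ala_fraction` are hypothetical, and the composition check simply counts glycine and alanine residues against the 60% and 80% preferences stated above.

```python
def build_spacer(unit, repeats):
    """Expand a repeat unit (list of three-letter residue codes) into a spacer."""
    return list(unit) * repeats


def gly_ala_fraction(residues):
    """Fraction of residues in the spacer that are glycine or alanine."""
    return sum(r in ("Gly", "Ala") for r in residues) / len(residues)


GPGG = ["Gly", "Pro", "Gly", "Gly"]   # repeat unit of (Gly-Pro-Gly-Gly)x
G4S = ["Gly"] * 4 + ["Ser"]           # repeat unit of (Gly4-Ser)y

if __name__ == "__main__":
    for name, unit, n in (("(Gly-Pro-Gly-Gly)3", GPGG, 3),
                          ("(Gly4-Ser)3", G4S, 3),
                          ("(Gly4-Ser)4", G4S, 4)):
        spacer = build_spacer(unit, n)
        print(name, len(spacer), round(gly_ala_fraction(spacer), 2), " ".join(spacer))
```

Expanding (Gly-Pro-Gly-Gly)3 in this way reproduces the twelve residues listed for SEQ ID NO:6.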

Nucleotide sequences encoding for the production of residues which may be
useful in purification of the expressed recombinant protein may also be built into the
vector. Such sequences are known in the art and include the glutathione binding
domain from glutathione S-transferase, polylysine, hexa-histidine or other cationic
amino acids, thioredoxin, hemagglutinin antigen and maltose binding protein.

Additionally, nucleotide sequences may be inserted into the gene of interest to
be incorporated so that the protein or peptide can also include from one to about six
amino acids that create signals for proteolytic cleavage. In this manner, if a gene is
designed to make one or more peptides or proteins of interest in the transgenic animal,
specific nucleotide sequences encoding for amino acids recognized by enzymes may
be incorporated into the gene to facilitate cleavage of the large protein or peptide
sequence into desired peptides or proteins or both. For example, nucleotides encoding
a proteolytic cleavage site can be introduced into the gene of interest so that a signal
sequence can be cleaved from a protein or peptide encoded by the gene of interest.
Nucleotide sequences encoding other amino acid sequences which display pH
sensitivity or chemical sensitivity may also be added to the vector to facilitate
separation of the signal sequence from the peptide or protein of interest.

In one embodiment of the present invention, a TAG sequence is linked to the
gene of interest. The TAG sequence serves three purposes: 1) it allows free rotation
of the peptide or protein to be isolated so there is no interference from the native
protein or signal sequence, i.e., vitellogenin, 2) it provides a "purification handle" to
isolate the protein using column purification, and 3) it includes a cleavage site to
remove the desired protein from the signal and purification sequences. Accordingly,
as used herein, a TAG sequence includes a spacer sequence, a purification handle and
a cleavage site. The spacer sequences in the TAG proteins contain one or more
repeats shown in SEQ ID NO:25. A preferred spacer sequence comprises the
sequence provided in SEQ ID NO:26. One example of a purification handle is the
gp41 hairpin loop from HIV-1. Exemplary gp41 polynucleotide and polypeptide
sequences are provided in SEQ ID NO:24 and SEQ ID NO:23, respectively.
However, it should be understood that any antigenic region may be used as a
purification handle, including any antigenic region of gp41. Preferred purification
handles are those that elicit highly specific antibodies. Additionally, the cleavage site
can be any protein cleavage site known to one of ordinary skill in the art and includes
an enterokinase cleavage site comprising the Asp Asp Asp Asp Lys sequence (SEQ
ID NO:9) and a furin cleavage site. Constructs containing a TAG sequence are shown
in Figures 2 and 3. In one embodiment of the present invention, the TAG sequence
comprises a polynucleotide sequence of SEQ ID NO:22.

Methods of Administering Transposon-Based Vectors

In addition to the transposon-based vectors described above, the present
invention also includes methods of administering the transposon-based vectors to an
animal, methods of producing a transgenic animal wherein a gene of interest is
incorporated into the germline of the animal and methods of producing a transgenic
animal wherein a gene of interest is incorporated into cells other than the germline
cells of the animal. The transposon-based vectors of the present invention may be
administered to an animal via any method known to those of skill in the art, including,
but not limited to, intraembryonic, intratesticular, intraoviduct, intraperitoneal,
intraarterial, intravenous, topical, oral, nasal, and pronuclear injection methods of
administration, or any combination thereof. The transposon-based vectors may also
be administered within the lumen of an organ, into an organ, into a body cavity, into
the cerebrospinal fluid, through the urinary system or through any route to reach the
desired cells.

The transposon-based vectors may be delivered through the vascular system to
be distributed to the cells supplied by that vessel. For example, the compositions may
be placed in the artery supplying the ovary or supplying the fallopian tube to transfect
cells in those tissues. In this manner, follicles could be transfected to create a
germline transgenic animal. Alternatively, supplying the compositions through the
artery leading to the oviduct would preferably transfect the tubular gland and
epithelial cells. Such transfected cells could manufacture a desired protein or peptide
for deposition in the egg white. Administration of the compositions through the portal
vein would target uptake and transformation of hepatic cells. Administration through
the urethra and into the bladder would target the transitional epithelium of the bladder.
Administration through the vagina and cervix would target the lining of the uterus.
Administration through the internal mammary artery would transfect secretory cells of
the lactating mammary gland to perform a desired function, such as to synthesize and
secrete a desired protein or peptide into the milk.

In a preferred embodiment, the animal is an egg-laying animal, and more
preferably, an avian. In one embodiment, between approximately 1 and 50 µg,
preferably between 1 and 20 µg, and more preferably between 5 and 10 µg of
transposon-based vector DNA is administered to the oviduct of a bird. Optimal
ranges depend upon the type of bird and the bird's stage of sexual maturity.
Intraoviduct administration of the transposon-based vectors of the present invention
results in a PCR positive signal in the oviduct tissue, whereas intravascular
administration results in a PCR positive signal in the liver. In other embodiments, the
transposon-based vector is administered to an artery that supplies the oviduct or the
liver. These methods of administration may also be combined with any methods for
facilitating transfection, including without limitation, electroporation, gene guns,
injection of naked DNA, and use of dimethyl sulfoxide (DMSO).

The present invention includes a method of intraembryonic administration of a
transposon-based vector to an avian embryo comprising the following steps: 1)
incubating an egg on its side at room temperature for two hours to allow the embryo
contained therein to move to top dead center (TDC); 2) drilling a hole through the
shell without penetrating the underlying shell membrane; 3) injecting the embryo with
the transposon-based vector in solution; 4) sealing the hole in the egg; and 5) placing
the egg in an incubator for hatching. Administration of the transposon-based vector
can occur anytime between immediately after egg lay (when the embryo is at Stage X)
and hatching. Preferably, the transposon-based vector is administered between 1 and
7 days after egg lay, more preferably, between 1 and 2 days after egg lay. The
transposon-based vectors may be introduced into the embryo in amounts ranging from
about 5.0 µg to 10 pg, preferably 1.0 µg to 100 pg. Additionally, the transposon-based
vector solution volume may be between approximately 1 µl and 75 µl in quail
and between approximately 1 µl and 500 µl in chicken.

The present invention also includes a method of intratesticular administration
of a transposon-based vector including injecting a bird with a composition comprising
the transposon-based vector, an appropriate carrier and an appropriate transfection
reagent. In one embodiment, the bird is injected before sexual maturity, preferably
between approximately 4-14 weeks, more preferably between approximately 6-14
weeks and most preferably between 8-12 weeks old. In another embodiment, a
mature bird is injected with a transposon-based vector, an appropriate carrier and an
appropriate transfection reagent. The mature bird may be any type of bird, but in one
example the mature bird is a quail.

A bird is preferably injected prior to the development of the blood-testis
barrier, which thereby facilitates entry of the transposon-based vector into the
seminiferous tubules and transfection of the spermatogonia or other germline cells.
At and between the ages of 4, 6, 8, 10, 12, and 14 weeks, it is believed that the testes
of chickens are likely to be most receptive to transfection. In this age range, the
blood/testis barrier has not yet formed, and there is a relatively high number of
spermatogonia relative to the numbers of other cell types, e.g., spermatids, etc. See J.
Kumaran et al., 1949. Poultry Sci., 29:511-520. See also E. Oakberg, 1956. Am. J.
Anatomy, 99:507-515; and P. Kluin et al., 1984. Anat. Embryol., 169:73-78.

The transposon-based vectors may be introduced into a testis in an amount
ranging from about 0.1 ng to 10 µg, preferably 1 µg to 10 µg, more preferably 3 µg to
10 µg. In a quail, about 5 µg is a preferred amount. In a chicken, about 5 µg to 10 µg
per testis is preferred. These amounts of vector DNA may be injected in one dose or
multiple doses and at one site or multiple sites in the testis. In a preferred
embodiment, the vector DNA is administered at multiple sites in a single testis, both
testes being injected in this manner. In one embodiment, injection is spread over
three injection sites: one at each end of the testis, and one in the middle. Additionally,
the transposon-based vector solution volume may be between approximately 1 µl and
75 µl in quail and between approximately 1 µl and 500 µl in chicken. In a preferred
embodiment, the transposon-based vector solution volume may be between
approximately 20 µl and 60 µl in quail and between approximately 50 µl and 250 µl in
chicken. Both the amount of vector DNA and the total volume injected into each
testis may be determined based upon the age and size of the bird.

According to the present invention, the transposon-based vector is
administered in conjunction with an acceptable carrier and/or transfection reagent.
Acceptable carriers include, but are not limited to, water, saline, Hanks Balanced Salt
Solution (HBSS), Tris-EDTA (TE) and lyotropic liquid crystals. Transfection
reagents commonly known to one of ordinary skill in the art that may be employed
include, but are not limited to, the following: cationic lipid transfection reagents,
cationic lipid mixtures, polyamine reagents, liposomes and combinations thereof;
SUPERFECT®, Cytofectene, BioPORTER®, GenePORTER®, NeuroPORTER®,
and perfectin from Gene Therapy Systems; lipofectamine, cellfectin, DMRIE-C,
oligofectamine, and PLUS reagent from Invitrogen; Xtreme gene, fugene, DOSPER
and DOTAP from Roche; Lipotaxi and Genejammer from Stratagene; and Escort
from SIGMA. In one embodiment, the transfection reagent is SUPERFECT®. The
ratio of DNA to transfection reagent may vary based upon the method of
administration. In one embodiment, the transposon-based vector is administered
intratesticularly and the ratio of DNA to transfection reagent can be from 1:1.5 to
1:15, preferably 1:2 to 1:10, all expressed as wt/vol. Transfection may also be
accomplished using other means known to one of ordinary skill in the art, including
without limitation electroporation, gene guns, injection of naked DNA, and use of
dimethyl sulfoxide (DMSO).
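As a worked illustration of the DNA to transfection reagent ratios above, the following Python sketch converts a DNA mass into a range of reagent volumes. It assumes that "wt/vol" means micrograms of DNA per microliter of reagent, which the text does not state explicitly, and the function name `reagent_volume_ul` is hypothetical.

```python
def reagent_volume_ul(dna_ug, ratio_low=1.5, ratio_high=15.0):
    """Return (min, max) reagent volume in microliters for a given DNA mass.

    Assumption: a ratio of 1:r (wt/vol) means r microliters of reagent
    per microgram of DNA.
    """
    if dna_ug <= 0:
        raise ValueError("DNA mass must be positive")
    return dna_ug * ratio_low, dna_ug * ratio_high


if __name__ == "__main__":
    # 5 µg of vector DNA over the broad range (1:1.5 to 1:15)
    print(reagent_volume_ul(5))
    # 5 µg of vector DNA over the preferred range (1:2 to 1:10)
    print(reagent_volume_ul(5, ratio_low=2, ratio_high=10))
```

Under that assumption, the combined DNA and reagent volume would still need to fit within the injection volumes given for quail and chicken.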

Depending upon the cell or tissue type targeted for transfection, the form of
the transposon-based vector may be important. Plasmids harvested from bacteria are
generally closed circular supercoiled molecules, and this is the preferred state of a
vector for gene delivery because of the ease of preparation. In some instances,
transposase expression and insertion may be more efficient in a relaxed, closed
circular configuration or in a linear configuration. In still other instances, a purified
transposase protein may be co-injected with a transposon-based vector containing the
gene of interest for more immediate insertion. This could be accomplished by using a
transfection reagent complexed with both the purified transposase protein and the
transposon-based vector.

Testing for and Breeding Animals Carrying the Transgene

Following administration of a transposon-based vector to an animal, DNA is
extracted from the animal to confirm integration of the gene of interest. Actual
frequencies of integration are estimated both by comparative strength of the PCR
signal, and by histological evaluation of the tissues by quantitative PCR. Another
method for estimating the rate of transgene insertion is the so-called primed in situ
hybridization technique (PRINS). This method determines not only which cells carry
a transgene of interest, but also into which chromosome the gene has inserted, and
even what portion of the chromosome. Briefly, labeled primers are annealed to
chromosome spreads (affixed to glass slides) through one round of PCR, and the
slides are then developed through normal in situ hybridization procedures. This
technique combines the best features of in situ PCR and fluorescence in situ
hybridization (FISH) to provide distinct chromosome location and copy number of the
gene in question. The 28S rRNA gene will be used as a positive control for
spermatogonia to confirm that the technique is functioning properly. Using different
fluorescent labels for the transgene and the 28S gene causes cells containing a
transgene to fluoresce with two different colored tags.

Breeding experiments are also conducted to determine if germline
transmission of the transgene has occurred. In a general bird breeding experiment
performed according to the present invention, each male bird was exposed to 2-3
different adult female birds for 3-4 days each. This procedure was continued with
different females for a total period of 6-12 weeks. Eggs were collected daily for up to
14 days after the last exposure to the transgenic male, and each egg was incubated in a
standard incubator. In the first series of experiments the resulting embryos were
examined for transgene presence at day 3 or 4 using PCR.

Any male producing a transgenic embryo was bred to additional females.
Eggs from these females were incubated, hatched, and the chicks tested for the
exogenous DNA. Any embryos that died were necropsied and examined directly for
the transgene or protein encoded by the transgene, either by fluorescence or PCR.
The offspring that hatched and were found to be positive for the exogenous DNA
were raised to maturity. These birds were bred to produce further generations of
transgenic birds, to verify efficiency of the transgenic procedure and the stable
incorporation of the transgene into the germ line. The resulting embryos were
examined for transgene presence at day 3 or 4 using PCR.

It is to be understood that the above procedure can be modified to suit animals
other than birds and that selective breeding techniques may be performed to amplify
gene copy numbers and protein output.

Production of Desired Proteins or Peptides in Egg White 

In one embodiment, the transposon-based vectors of the present invention may
be administered to a bird for production of desired proteins or peptides in the egg
white. These transposon-based vectors preferably contain one or more of an
ovalbumin promoter, an ovomucoid promoter, an ovalbumin signal sequence and an
ovomucoid signal sequence. Oviduct-specific ovalbumin promoters are described in
B. O'Malley et al., 1987. EMBO J., vol. 6, pp. 2305-12; A. Qiu et al., 1994. Proc. Nat.
Acad. Sci. (USA), vol. 91, pp. 4451-4455; D. Monroe et al., 2000. Biochim. Biophys.
Acta, 1517(1):27-32; H. Park et al., 2000. Biochem., 39:8537-8545; and T.
Muramatsu et al., 1996. Poult. Avian Biol. Rev., 6:107-123. Examples of
transposon-based vectors designed for production of a desired protein in an egg white
are shown in Figures 2 and 3.

Production of Desired Proteins or Peptides in Egg Yolk 

The present invention is particularly advantageous for production of
recombinant peptides and proteins of low solubility in the egg yolk. Such proteins
include, but are not limited to, membrane-associated or membrane-bound proteins,
lipophilic compounds, attachment factors, receptors, and components of second
messenger transduction machinery. Low solubility peptides and proteins are
particularly challenging to produce using conventional recombinant protein
production techniques (cell and tissue cultures) because they aggregate in
water-based, hydrophilic environments. Such aggregation necessitates denaturation and
re-folding of the recombinantly-produced proteins, which may deleteriously affect their
structure and function. Moreover, even highly soluble recombinant peptides and
proteins may precipitate and require denaturation and renaturation when produced in
sufficiently high amounts in recombinant protein production systems. The present
invention provides an advantageous resolution of the problem of protein and peptide
solubility during production of large amounts of recombinant proteins.

In one embodiment of the present invention, deposition of a desired protein
into the egg yolk is accomplished by attaching a sequence encoding a protein capable
of binding to the yolk vitellogenin receptor to a gene of interest that encodes a desired
protein. This transposon-based vector can be used for the receptor-mediated uptake
of the desired protein by the oocytes. In a preferred embodiment, the sequence
ensuring the binding to the vitellogenin receptor is a targeting sequence of a
vitellogenin protein. The invention encompasses various vitellogenin proteins and
their targeting sequences. In a preferred embodiment, a chicken vitellogenin protein
targeting sequence is used; however, due to the high degree of conservation among
vitellogenin protein sequences and known cross-species reactivity of vitellogenin
targeting sequences with their egg-yolk receptors, other vitellogenin targeting
sequences can be substituted. One example of a construct for use in the
transposon-based vectors of the present invention and for deposition of an insulin protein
in an egg yolk is provided in SEQ ID NO:27. In this embodiment, the transposon-based
vector contains a vitellogenin promoter, a vitellogenin targeting sequence, a TAG
sequence, a pro-insulin sequence and a synthetic polyA sequence. The present
invention includes, but is not limited to, vitellogenin targeting sequences residing in
the N-terminal domain of vitellogenin, particularly in lipovitellin I. In one
embodiment, the vitellogenin targeting sequence contains the polynucleotide
sequence of SEQ ID NO:18.

In a preferred embodiment, the transposon-based vector contains a transposase
gene operably-linked to a liver-specific promoter and a gene of interest
operably-linked to a liver-specific promoter and a vitellogenin targeting sequence. Figure 4
shows an example of such a construct. In another preferred embodiment, the
transposon-based vector contains a transposase gene operably-linked to a constitutive
promoter and a gene of interest operably-linked to a liver-specific promoter and a
vitellogenin targeting sequence.

Isolation and Purification of Desired Protein or Peptide

For large-scale production of protein, an animal breeding stock that is
homozygous for the transgene is preferred. Such homozygous individuals are
obtained and identified through, for example, standard animal breeding procedures or
PCR protocols.

Once expressed, peptides, polypeptides and proteins can be purified according
to standard procedures known to one of ordinary skill in the art, including ammonium
sulfate precipitation, affinity columns, column chromatography, gel electrophoresis,
high performance liquid chromatography, immunoprecipitation and the like.
Substantially pure compositions of about 50 to 99% homogeneity are preferred, and
80 to 95% or greater homogeneity are most preferred for use as therapeutic agents.

In one embodiment of the present invention, the animal in which the desired
protein is produced is an egg-laying animal. In a preferred embodiment of the present
invention, the animal is an avian and a desired peptide, polypeptide or protein is
isolated from an egg white. Egg white containing the exogenous protein or peptide is
separated from the yolk and other egg constituents on an industrial scale by any of a
variety of methods known in the egg industry. See, e.g., W. Stadelman et al. (Eds.),
Egg Science & Technology, Haworth Press, Binghamton, NY (1995). Isolation of the
exogenous peptide or protein from the other egg white constituents is accomplished
by any of a number of polypeptide isolation and purification methods well known to
one of ordinary skill in the art. These techniques include, for example,
chromatographic methods such as gel permeation, ion exchange, affinity separation,
metal chelation, HPLC, and the like, either alone or in combination. Another means
that may be used for isolation or purification, either in lieu of or in addition to
chromatographic separation methods, includes electrophoresis. Successful isolation
and purification is confirmed by standard analytic techniques, including HPLC, mass
spectroscopy, and spectrophotometry. These separation methods are often facilitated
if the first step in the separation is the removal of the endogenous ovalbumin fraction
of egg white, as doing so will reduce the total protein content to be further purified by
about 50%.

To facilitate or enable purification of a desired protein or peptide,
transposon-based vectors may include one or more additional epitopes or domains. Such epitopes
or domains include DNA sequences encoding enzymatic or chemical cleavage sites
including, but not limited to, an enterokinase cleavage site; the glutathione binding
domain from glutathione S-transferase; polylysine; hexa-histidine or other cationic
amino acids; thioredoxin; hemagglutinin antigen; maltose binding protein; a fragment
of gp41 from HIV; and other purification epitopes or domains commonly known to
one of skill in the art.

In one representative embodiment, purification of desired proteins from egg
white utilizes the antigenicity of the ovalbumin carrier protein and particular attributes
of a TAG linker sequence that spans ovalbumin and the desired protein. The TAG
sequence is particularly useful in this process because it contains 1) a highly antigenic
epitope, a fragment of gp41 from HIV, allowing for stringent affinity purification,
and 2) a recognition site for the protease enterokinase immediately juxtaposed to the
desired protein. In a preferred embodiment, the TAG sequence comprises
approximately 50 amino acids. A representative TAG sequence is provided below.



Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys
Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys (SEQ ID NO:22)



The underlined sequences were taken from the hairpin loop domain of HIV gp-41
(SEQ ID NO:23). Sequences in italics represent the cleavage site for enterokinase
(SEQ ID NO:9). The spacer sequence upstream of the loop domain was made from
repeats of (Pro Ala Asp Asp Ala) (SEQ ID NO:25) to provide free rotation and
promote surface availability of the hairpin loop from the ovalbumin carrier protein.
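
The composition of the representative TAG sequence can be checked by assembling it from its parts and comparing the length with the stated figure of approximately 50 amino acids. The following is a minimal Python sketch, not part of the original disclosure. All names are illustrative, and the division of the residues after the spacer into a 13-residue loop segment, a Leu Leu join and the enterokinase site is an assumption, because the underlining and italics of the printed sequence are not preserved in plain text.

```python
# Three-letter to one-letter codes for the residues found in SEQ ID NO:22.
THREE_TO_ONE = {
    "Pro": "P", "Ala": "A", "Asp": "D", "Thr": "T", "Cys": "C", "Ile": "I",
    "Leu": "L", "Lys": "K", "Gly": "G", "Ser": "S", "Trp": "W",
}

SPACER_REPEAT = "Pro Ala Asp Asp Ala".split()   # spacer unit, SEQ ID NO:25
SPACER_COPIES = 6                               # counted from the printed sequence
# Assumed boundaries (underline/italic markup is lost in plain text):
LOOP = "Thr Thr Cys Ile Leu Lys Gly Ser Cys Gly Trp Ile Gly".split()
JOIN = "Leu Leu".split()
ENTEROKINASE_SITE = "Asp Asp Asp Asp Lys".split()


def build_tag():
    """Return the TAG linker as a list of three-letter residue codes."""
    return SPACER_REPEAT * SPACER_COPIES + LOOP + JOIN + ENTEROKINASE_SITE


def to_one_letter(residues):
    """Convert a list of three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


tag = build_tag()
tag_one_letter = to_one_letter(tag)
print(len(tag), tag_one_letter)
```

Under these assumptions the assembled linker is 50 residues long and ends with the enterokinase recognition site, so that cleavage occurs directly upstream of the desired protein.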
Isolation and purification of a desired protein is performed as follows:

1. Enrichment of the egg white protein fraction containing ovalbumin and the
transgenic ovalbumin-TAG-desired protein.

2. Size exclusion chromatography to isolate only those proteins within a narrow
range of molecular weights (a further enrichment of step 1).

3. Ovalbumin affinity chromatography. Highly specific antibodies to ovalbumin
will eliminate virtually all extraneous egg white proteins except ovalbumin
and the transgenic ovalbumin-TAG-desired protein.

4. gp41 affinity chromatography using anti-gp41 antibodies. Stringent
application of this step will result in virtually pure transgenic
ovalbumin-TAG-desired protein.

5. Cleavage of the transgene product can be accomplished in at least one of two
ways:

a. The transgenic ovalbumin-TAG-desired protein is left attached to the
gp41 affinity resin (beads) from step 4 and the protease enterokinase is
added. This liberates the transgene target protein from the gp41 affinity
resin while the ovalbumin-TAG sequence is retained. Separation by
centrifugation (in a batch process) or flow through (in a column
purification) leaves the desired protein together with enterokinase in
solution. Enterokinase is recovered and reused.

b. Alternatively, enterokinase is immobilized on resin (beads) by the
addition of poly-lysine moieties to a non-catalytic area of the protease.
The transgenic ovalbumin-TAG-desired protein eluted from the
affinity column of step 4 is then applied to the protease resin. Protease
action cleaves the ovalbumin-TAG sequence from the desired protein
and leaves both entities in solution. The immobilized enterokinase
resin is recharged and reused.

c. The choice of these alternatives is made depending upon the size and
chemical composition of the transgene target protein.

6. A final separation of either of these two (5a or 5b) protein mixtures is made
using size exclusion or enterokinase affinity chromatography. This step
allows for desalting, buffer exchange and/or polishing, as needed.

Cleavage of the transgene product (ovalbumin-TAG-desired protein) by
enterokinase, then, results in two products: ovalbumin-TAG and the desired protein.
More specific methods for isolation using the TAG label are provided in the Examples.
Some desired proteins may require additions or modifications of the above-described
approach as known to one of ordinary skill in the art. The method is scalable from
the laboratory bench to pilot and production facility largely because the techniques
applied are well documented in each of these settings.
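
The cleavage step can be illustrated at the sequence level. The sketch below is an illustration only: it treats enterokinase as cutting immediately after the Lys of the Asp Asp Asp Asp Lys recognition site, and the carrier and desired-protein strings are hypothetical placeholders, not real ovalbumin or product sequences.

```python
ENTEROKINASE_SITE = "DDDDK"  # Asp Asp Asp Asp Lys; the cut falls after the Lys


def enterokinase_cleave(fusion):
    """Split a fusion protein immediately after the first enterokinase site."""
    start = fusion.find(ENTEROKINASE_SITE)
    if start == -1:
        raise ValueError("no enterokinase recognition site found")
    cut = start + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


# TAG linker in one-letter code, as given in the representative sequence.
TAG = "PADDA" * 6 + "TTCILKGSCGWIG" + "LL" + ENTEROKINASE_SITE
CARRIER = "MAAAAAAAAA"   # hypothetical placeholder for the ovalbumin carrier
DESIRED = "ACDEFGHIK"    # hypothetical placeholder for the desired protein

carrier_tag, released = enterokinase_cleave(CARRIER + TAG + DESIRED)
print(carrier_tag)
print(released)
```

Because the recognition site sits at the very end of the TAG, the released fragment carries no linker residues, which is consistent with the two products named above: ovalbumin-TAG and the desired protein.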
It is believed that a typical chicken egg produced by a transgenic animal of the
present invention will contain at least 0.001 mg, from about 0.001 to 1.0 mg, or from
about 0.001 to 100.0 mg of exogenous protein, peptide or polypeptide, in addition to
the normal constituents of egg white (or possibly replacing a small fraction of the
latter).

One of skill in the art will recognize that after biological expression or
purification, the desired proteins, fragments thereof and peptides may possess a
conformation substantially different than the native conformations of the proteins,
fragments thereof and peptides. In this case, it is often necessary to denature and
reduce protein and then to cause the protein to re-fold into the preferred conformation.
Methods of reducing and denaturing proteins and inducing re-folding are well known
to those of skill in the art.

Production of Protein or Peptide in Milk 
In addition to methods of producing eggs containing transgenic proteins or
peptides, the present invention encompasses methods for the production of milk
containing transgenic proteins or peptides. These methods include the administration
of a transposon-based vector described above to a mammal. In one embodiment, the
transposon-based vector contains a transposase operably linked to a constitutive
promoter and a gene of interest operably linked to a mammary-specific promoter.
Genes of interest can include, but are not limited to, antiviral and antibacterial proteins
and immunoglobulins.

Treatment of Disease and Animal Improvement 

In addition to production and isolation of desired molecules, the transposon-based
vectors of the present invention can be used for the treatment of various genetic
disorders. For example, one or more transposon-based vectors can be administered to
a human or animal for the treatment of a single gene disorder including, but not
limited to, Huntington's disease, alpha-1-antitrypsin deficiency, Alzheimer's disease,
various forms of breast cancer, cystic fibrosis, galactosemia, congenital
hypothyroidism, maple syrup urine disease, neurofibromatosis 1, phenylketonuria,
sickle cell disease, and Smith-Lemli-Opitz (SLO/RSH) Syndrome. Other diseases
caused by single gene disorders that may be treated with the present invention
include autoimmune diseases, shipping fever in cattle, mastitis, bacterial or viral
diseases, and alteration of skin pigment in animals. In these embodiments, the
transposon-based vector contains a non-mutated, or non-disease causing, form of the
gene known to cause such disorder. Preferably, the transposase contained within the
transposon-based vector is operably linked to an inducible promoter such as a
tissue-specific promoter such that the non-mutated gene of interest is inserted into a specific
tissue wherein the mutated gene is expressed in vivo.

In one embodiment of the present invention, a transposon-based vector
comprising a gene encoding proinsulin is administered to diabetic animals or humans
for incorporation into liver cells in order to treat or cure diabetes. The specific
incorporation of the proinsulin gene into the liver is accomplished by placing the
transposase gene under the control of a liver-specific promoter, such as G6P. This
approach is useful for treatment of both Type I and Type II diabetes. The G6P
promoter has been shown to be glucose responsive (Argaud, D., et al. 1996. Diabetes
45:1563-1571), and thus, glucose-regulated insulin production is achieved using DNA
constructs of the present invention. Integrating a proinsulin gene into liver cells
circumvents the problem of destruction of pancreatic islet cells in the course of Type I
diabetes.

In another embodiment, shortly after diagnosis of Type I diabetes, the cells of
the immune system destroying pancreatic β-cells are selectively removed using the
transposon-based vectors of the present invention, thus allowing normal β-cells to
repopulate the pancreas.

For treatment of Type II diabetes, a transposon-based vector containing a
proinsulin gene is specifically incorporated into the pancreas by placing the
transposase gene under the control of a pancreas-specific promoter, such as an insulin
promoter. In this embodiment, the vector is delivered to a diabetic animal or human
via injection into an artery feeding the pancreas. For delivery, the vector is
complexed with a transfection agent. The artery distributes the complex throughout
the pancreas, where individual cells receive the vector DNA. Following uptake into
the target cell, the insulin promoter is recognized by transcriptional machinery of the
cell, the transposase encoded by the vector is expressed, and stable integration of the
proinsulin gene occurs. It is expected that a small percentage of the transposon-based
vector is transported to other tissues, and that these tissues are transfected. However,
these tissues are not stably transfected and the proinsulin gene is not incorporated into
the cells' DNA due to failure of these cells to activate the insulin promoter. The
vector DNA is likely lost when the cell dies or is degraded over time.

In other embodiments, one or more transposon-based vectors are administered
to an avian for the treatment of a viral or bacterial infection/disease including, but not
limited to, Colibacillosis (Coliform infections), Mycoplasmosis (CRD, Air sac,
Sinusitis), Fowl Cholera, Necrotic Enteritis, Ulcerative Enteritis (Quail disease),
Pullorum Disease, Fowl Typhoid, Botulism, Infectious Coryza, Erysipelas, Avian
Pox, Newcastle Disease, Infectious Bronchitis, Quail Bronchitis, Lymphoid Leukosis,
Marek's Disease (Visceral Leukosis), and Infectious Bursal Disease (Gumboro). In these
embodiments, the transposon-based vectors may be used in a manner similar to
traditional vaccines.

In still other embodiments, one or more transposon-based vectors are
administered to an animal for the production of an animal with enhanced growth
characteristics and nutrient utilization.

The transposon-based vectors of the present invention can be used to
transform any animal cell, including but not limited to: cells producing hormones,
cytokines, growth factors, or any other biologically active substance; cells of the
immune system; cells of the nervous system; muscle (striated, cardiac, smooth) cells;
vascular system cells; endothelial cells; skin cells; mammary cells; and lung cells,
including bronchial and alveolar cells. Transformation of any endocrine cell by a
transposon-based vector is contemplated as a part of the present invention. In one
aspect of the present invention, cells of the immune system may be the target for
incorporation of a desired gene or genes encoding for production of antibodies.
Accordingly, the thymus, bone marrow, beta lymphocytes (or B cells), gastrointestinal
associated lymphatic tissue (GALT), Peyer's patches, bursa of Fabricius, lymph nodes,
spleen, and tonsil, and any other lymphatic tissue, may all be targets for
administration of the compositions of the present invention.

The transposon-based vectors of the present invention can be used to modulate
(stimulate or inhibit) production of any substance, including but not limited to a
hormone, a cytokine, or a growth factor, by an animal or a human cell. Modulation of
a regulated signal within a cell or a tissue, such as production of a second messenger,
is also contemplated as a part of the present invention. Use of the transposon-based
vectors of the present invention is contemplated for treatment of any animal or human
disease or condition that results from underproduction (such as diabetes) or
overproduction (such as hyperthyroidism) of a hormone or other endogenous
biologically active substance. Use of the transposon-based vectors of the present
invention to integrate nucleotide sequences encoding RNA molecules, such as
antisense RNA or short interfering RNA, is also contemplated as a part of the present
invention.

Additionally, the transposon-based vectors of the present invention may be
used to provide cells or tissues with "beacons", such as receptor molecules, for
binding of therapeutic agents in order to provide tissue and cell specificity for the
therapeutic agents. Several promoters and exogenous genes can be combined in one
vector to produce progressive, controlled treatments from a single vector delivery.

The following examples will serve to further illustrate the present invention
without, at the same time, however, constituting any limitation thereof. On the
contrary, it is to be clearly understood that resort may be had to various embodiments,
modifications and equivalents thereof which, after reading the description herein, may
suggest themselves to those skilled in the art without departing from the spirit of the
invention.

EXAMPLE 1

Preparation of Transposon-Based Vector pTnMod 

A vector was designed for inserting a desired coding sequence into the
genome of eukaryotic cells, given below as SEQ ID NO:1. The vector of SEQ ID
NO:1, termed pTnMod, was constructed and its sequence verified.

This vector employed a cytomegalovirus (CMV) promoter. A modified Kozak
sequence (ACCATG) (SEQ ID NO:13) was added to the promoter. The nucleotide in
the wobble position in nucleotide triplet codons encoding the first 10 amino acids of
transposase was changed to an adenine (A) or thymine (T), which did not alter the
amino acid encoded by this codon. Two stop codons were added and a synthetic
polyA was used to provide a strong termination sequence. This vector uses a
promoter designed to be active soon after entering the cell (without any induction) to
increase the likelihood of stable integration. The additional stop codons and synthetic
polyA ensure proper termination without read through to potential genes
downstream.
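
The wobble-position change described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the example input is an invented coding sequence rather than the native Tn10 sequence, only the third base of each codon is altered, and the preference for A over T when both are synonymous is arbitrary.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def at_wobble(codon):
    """Return a synonymous codon ending in A or T, or the codon unchanged."""
    if codon[2] in "AT":
        return codon
    for base in "AT":  # arbitrary preference: try A first, then T
        candidate = codon[:2] + base
        if CODON_TABLE[candidate] == CODON_TABLE[codon]:
            return candidate
    return codon  # e.g. ATG (Met) and TGG (Trp) have no A/T-ending synonym


def soften_first_codons(cds, n_codons=10):
    """Apply at_wobble to the first n_codons of a coding sequence.

    Assumes len(cds) is a multiple of three.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    head = [at_wobble(c) for c in codons[:n_codons]]
    return "".join(head + codons[n_codons:])


def translate(cds):
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


example = "ATGTGCGAGCTGGACATC"          # hypothetical input, not the Tn10 sequence
softened = soften_first_codons(example)
print(softened, translate(example) == translate(softened))
```

The start codon is left untouched because methionine has a single codon, which is why only the codons following ATG can carry an A or T in the third position.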

The first step in constructing this vector was to modify the transposase to have
the desired changes. Modifications to the transposase were accomplished with the
primers High Efficiency forward primer (Hef) Altered transposase (ATS)-Hef 5'
ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC 3'
(SEQ ID NO:10) and Altered transposase-High efficiency reverse primer (Her) 5'
GATTGATCATTATCATAATTTCCCCAAAGCGTAACC 3' (SEQ ID NO:11, a
reverse complement primer). In the 5' forward primer ATS-Hef, the sequence
CTCGAG (SEQ ID NO:12) is the recognition site for the restriction enzyme Xho I,
which permits directional cloning of the amplified gene. The sequence ACCATG
(SEQ ID NO:13) contains the Kozak sequence and start codon for the transposase and
the underlined bases represent changes in the wobble position to an A or T of codons
for the first 10 amino acids (without changing the amino acid coded by the codon).
Primer ATS-Her (SEQ ID NO:11) contains an additional stop codon TAA in addition
to native stop codon TGA and adds a Bcl I restriction site, TGATCA (SEQ ID
NO:14), to allow directional cloning. These primers were used in a PCR reaction
with pTnLac (p defines plasmid, tn defines transposon, and lac defines the beta
fragment of the lactose gene, which contains a multiple cloning site) as the template
for the transposase and a FailSafe™ PCR System (which includes enzyme, buffers,
dNTP's, MgCl2 and PCR Enhancer; Epicentre Technologies, Madison, WI).
Amplified PCR product was electrophoresed on a 1% agarose gel, stained with
ethidium bromide, and visualized on an ultraviolet transilluminator. A band
corresponding to the expected size was excised from the gel and purified from the
agarose using a Zymo Clean Gel Recovery Kit (Zymo Research, Orange, CA).
Purified DNA was digested with restriction enzymes Xho I (5') and Bcl I (3') (New
England Biolabs, Beverly, MA) according to the manufacturer's protocol. Digested
DNA was purified from restriction enzymes using a Zymo DNA Clean and
Concentrator kit (Zymo Research).
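
The features attributed to the two primers can be checked with a short script. The sketch below is illustrative only. The forward primer string is a reconstruction of the printed sequence, in which a few characters are illegible, and should be treated as an assumption; the reverse primer is as printed. Helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

XHO_I, BCL_I, KOZAK = "CTCGAG", "TGATCA", "ACCATG"

# Forward primer: reconstructed from a partly illegible printing (assumption).
ATS_HEF = "ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC"
# Reverse primer: as printed (SEQ ID NO:11).
ATS_HER = "GATTGATCATTATCATAATTTCCCCAAAGCGTAACC"


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def codons_of(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def translate(seq):
    return "".join(CODON_TABLE[c] for c in codons_of(seq))


# The reading frame starts at the ATG inside the Kozak sequence ACCATG.
orf_start = ATS_HEF.index(KOZAK) + 3
forward_codons = codons_of(ATS_HEF[orf_start:])
her_sense = reverse_complement(ATS_HER)   # sense-strand view of the 3' end

print(XHO_I in ATS_HEF, BCL_I in ATS_HER)
print(translate(ATS_HEF[orf_start:]))
print(her_sense, translate(her_sense))
```

In the sense-strand view of the reverse primer, the native TGA stop is followed directly by the added TAA stop and then the Bcl I site, matching the description in the text. In the reconstructed forward primer, every codon after the ATG within the first ten ends in A or T.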

Plasmid gWhiz (Gene Therapy Systems, San Diego, CA) was digested with
restriction enzymes Sal I and BamH I (New England Biolabs), which are compatible
with Xho I and Bcl I, but destroy the restriction sites. Digested gWhiz was separated
on an agarose gel, the desired band excised and purified as described above. Cutting
the vector in this manner facilitated directional cloning of the modified transposase
(mATS) between the CMV promoter and synthetic polyA.

To insert the mATS between the CMV promoter and synthetic polyA in
gWhiz, a Stratagene T4 Ligase Kit (Stratagene, Inc., La Jolla, CA) was used and the
ligation set up according to the manufacturer's protocol. Ligated product was
transformed into E. coli Top10 competent cells (Invitrogen Life Technologies,
Carlsbad, CA) using chemical transformation according to Invitrogen's protocol.
Transformed bacteria were incubated in 1 ml of SOC (GIBCO BRL, CAT# 15544-042)
medium for 1 hour at 37°C before being spread to LB (Luria-Bertani media
(broth or agar)) plates supplemented with 100 µg/ml ampicillin (LB/amp plates).
These plates were incubated overnight at 37°C and resulting colonies picked to
LB/amp broth for overnight growth at 37°C. Plasmid DNA was isolated using a
modified alkaline lysis protocol (Sambrook et al., 1989), electrophoresed on a 1%
agarose gel, and visualized on a U.V. transilluminator after ethidium bromide
staining. Colonies producing a plasmid of the expected size (approximately 6.4 kbp)
were cultured in at least 250 ml of LB/amp broth and plasmid DNA harvested using a
Qiagen Maxi-Prep Kit (column purification) according to the manufacturer's protocol
(Qiagen, Inc., Chatsworth, CA). Column purified DNA was used as template for
sequencing to verify that the changes made in the transposase were the desired changes
and that no further changes or mutations occurred due to PCR amplification. For
sequencing, Perkin-Elmer's Big Dye Sequencing Kit was used. All samples were sent
to the Gene Probes and Expression Laboratory (LSU School of Veterinary Medicine)
for sequencing on a Perkin-Elmer Model 377 Automated Sequencer.

Once a clone was identified that contained the desired mATS in the correct
orientation, primers CMVf-NgoM IV (5' T TGCCGGCA TCAGATTGGCTAT (SEQ
ID NO:15); underlined bases denote NgoM IV recognition site) and Syn-polyA-BstE
II (5' AG AGGTCACC GGGTCAATTCTTCAGCACCTGGTA (SEQ ID NO:16);
underlined bases denote BstE II recognition site) were used to PCR amplify the entire
CMV promoter, mATS, and synthetic polyA for cloning upstream of the transposon
in pTnLac. The PCR was conducted with FailSafe™ as described above, purified
using the Zymo Clean and Concentrator kit, the ends digested with NgoM IV and
BstE II (New England Biolabs), purified with the Zymo kit again and cloned upstream
of the transposon in pTnLac as described below.

Plasmid pTnLac was digested with NgoM IV and BstE II to remove the ptac
promoter and transposase and the fragments separated on an agarose gel. The band
corresponding to the vector and transposon was excised, purified from the agarose,
and dephosphorylated with calf intestinal alkaline phosphatase (New England
Biolabs) to prevent self-annealing. The enzyme was removed from the vector using a
Zymo DNA Clean and Concentrator-5. The purified vector and CMVp/mATS/polyA
were ligated together using a Stratagene T4 Ligase Kit and transformed into E. coli as
described above.

Colonies resulting from this transformation were screened (mini-preps) as
described above and clones that were the correct size were verified by DNA sequence
analysis as described above. The vector was given the name pTnMod (SEQ ID NO:1)
and includes the following components:

Base pairs 1-130 are a remainder of F1(-) ori from pBluescriptII sk(-)
(Stratagene), corresponding to base pairs 1-130 of pBluescriptII sk(-).

Base pairs 131-132 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 133-1777 are the CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems), corresponding to bp 229-1873 of pGWiz. The
CMV promoter was modified by the addition of an ACC sequence upstream of ATG.

Base pairs 1778-1779 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 1780-2987 are the coding sequence for the transposase, modified
from Tn10 (GenBank accession J01829) by optimizing codons for stability of the
transposase mRNA and for the expression of protein. More specifically, in each of the
codons for the first ten amino acids of the transposase, G or C was changed to A or T
when such a substitution would not alter the amino acid that was encoded.

Base pairs 2988-2993 are two engineered stop codons.

Base pair 2994 is a residue from ligation of restriction enzyme sites used in
constructing the vector.

Base pairs 2995-3410 are a synthetic polyA sequence taken from the pGWiz
vector (Gene Therapy Systems), corresponding to bp 1922-2337 of pGWiz.

Base pairs 3415-3718 are non-coding DNA that is residual from vector
pNK2859.

Base pairs 3719-3761 are non-coding λ DNA that is residual from pNK2859.

Base pairs 3762-3831 are the 70 bp of the left insertion sequence recognized
by the transposon Tn10.

Base pairs 3832-3837 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 3838-4527 are the multiple cloning site from pBluescriptII sk(-),
corresponding to bp 924-235 of pBluescriptII sk(-). This multiple cloning site may be
used to insert any coding sequence of interest into the vector.

Base pairs 4528-4532 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 4533-4602 are the 70 bp of the right insertion sequence
recognized by the transposon Tn10.

Base pairs 4603-4644 are non-coding λ DNA that is residual from pNK2859.

Base pairs 4645-5488 are non-coding DNA that is residual from pNK2859.

Base pairs 5489-7689 are from the pBluescriptII sk(-) base vector
(Stratagene, Inc.), corresponding to bp 761-2961 of pBluescriptII sk(-).

Completing pTnMod is a pBluescript backbone that contains a ColE1 origin of
replication and an antibiotic resistance marker (ampicillin).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s). Missing nucleotide sequences
in the above construct represent restriction site remnants.
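
The component list for pTnMod can be treated as a simple feature table and scanned for the unlisted positions mentioned above. The following Python sketch is illustrative only; the coordinates are copied from the list, while the labels and function name are shorthand introduced here.

```python
# (start, end, label) for each listed component of pTnMod (SEQ ID NO:1), 1-based inclusive.
PTNMOD_FEATURES = [
    (1, 130, "F1(-) ori remainder"),
    (131, 132, "ligation residue"),
    (133, 1777, "CMV promoter/enhancer"),
    (1778, 1779, "ligation residue"),
    (1780, 2987, "modified Tn10 transposase"),
    (2988, 2993, "two engineered stop codons"),
    (2994, 2994, "ligation residue"),
    (2995, 3410, "synthetic polyA"),
    (3415, 3718, "pNK2859 residual DNA"),
    (3719, 3761, "pNK2859 residual DNA"),
    (3762, 3831, "left insertion sequence"),
    (3832, 3837, "ligation residue"),
    (3838, 4527, "multiple cloning site"),
    (4528, 4532, "ligation residue"),
    (4533, 4602, "right insertion sequence"),
    (4603, 4644, "pNK2859 residual DNA"),
    (4645, 5488, "pNK2859 residual DNA"),
    (5489, 7689, "pBluescriptII sk(-) base vector"),
]


def find_gaps(features):
    """Return (start, end) ranges not covered by any listed feature."""
    gaps = []
    expected = 1
    for start, end, _label in sorted(features):
        if start > expected:
            gaps.append((expected, start - 1))
        expected = max(expected, end + 1)
    return gaps


gaps = find_gaps(PTNMOD_FEATURES)
total_length = max(end for _start, end, _label in PTNMOD_FEATURES)
print(gaps, total_length)
```

Run on the list as given, the only unlisted stretch is the four base pairs between the synthetic polyA and the pNK2859-derived DNA, which the text attributes to restriction site remnants.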

All plasmid DNA was isolated by standard procedures. Briefly, Escherichia
coli containing the plasmid was grown in 500 mL aliquots of LB broth (supplemented
with an appropriate antibiotic) at 37°C overnight with shaking. Plasmid DNA was
recovered from the bacteria using a Qiagen Maxi-Prep kit (Qiagen, Inc., Chatsworth,
CA) according to the manufacturer's protocol. Plasmid DNA was resuspended in 500
µL of PCR-grade water and stored at -20°C until used.



EXAMPLE 2 

Preparation of Transposon-Based Vector pTnMod (CMV/Red)

A vector was designed for inserting a reporter gene (DsRed) under the control
of the CMV promoter into the genome of vertebrate cells, given below as SEQ ID
NO:2. The reporter gene chosen was the DsRed gene, driven by the immediate early
cytomegalovirus promoter, to produce a plasmid called pTnCMV/DsRed. The DsRed
gene product is a red fluorescent protein from an Indo-Pacific sea anemone,
Discosoma sp., which fluoresces bright red at 558 nm. It is to be understood that the
reporter gene, i.e., the DsRed gene, is only one embodiment of the present invention
and that any gene of interest may be inserted into the plasmid in place of the DsRed
reporter gene in any Experiment described herein.

The vector of SEQ ID NO:2, named pTnMod (CMV/Red), was constructed,
and its sequence verified by re-sequencing. SEQ ID NO:2, pTnMod (CMV/Red),
includes the following components:

Base pairs 1-130 are a remainder of F1(-) ori from pBluescriptII sk(-)
(Stratagene), corresponding to bp 1-130 of pBluescriptII sk(-).

Base pairs 131-132 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 133-1777 are the CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems), corresponding to bp 229-1873 of pGWiz.

Base pairs 1778-1779 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 1780-2987 are the coding sequence for the transposase, modified
from Tn10 (GenBank accession J01829) by optimizing codons as discussed above.

Base pairs 2988-2993 are two engineered stop codons.

Base pair 2994 is a residue from ligation of restriction enzyme sites used in
constructing the vector.

Base pairs 2995-3410 are a synthetic polyA sequence taken from the pGWiz
vector (Gene Therapy Systems), corresponding to bp 1922-2337 of pGWiz.

Base pairs 3415-3718 are non-coding DNA that is residual from vector
pNK2859.

Base pairs 3719-3761 are non-coding λ DNA that is residual from pNK2859.

Base pairs 3762-3831 are the 70 bp of the left insertion sequence recognized
by the transposon Tn10.

Base pairs 3832-3837 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 3838-4044 are part of the multiple cloning site from pBluescriptII
sk(-), corresponding to bp 924-718 of pBluescriptII sk(-).

Base pairs 4045-4048 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 4049-5693 are the CMV promoter/enhancer, taken from vector
pGWiz (Gene Therapy Systems), corresponding to bp 229-1873 of pGWiz.

Base pairs 5694-5701 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 5702-6617 are the DsRed reporter coding sequence, including
polyA sequence, from pDsRed1.1 (Clontech), corresponding to bp 77-992 of
pDsRed1.1.

Base pairs 6618-7101 are part of the multiple cloning site from pBluescriptII
sk(-), corresponding to bp 718-235 of pBluescriptII sk(-).

Base pairs 7102-7106 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 7107-7176 are the 70 bp of the right insertion sequence
recognized by the transposon Tn10.

Base pairs 7177-7218 are non-coding λ DNA that is residual from pNK2859.

Base pairs 7219-8062 are non-coding DNA that is residual from pNK2859.

Base pairs 8063-10263 are from the pBluescriptII sk(-) base vector
(Stratagene, Inc.), corresponding to bp 761-2961 of pBluescriptII sk(-).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s).



EXAMPLE 3 

Preparation of Transposon-Based Vector pTnMod (Oval/Red) - Chicken

A vector was designed for inserting a reporter gene (DsRed) under the control
of the ovalbumin promoter, and including the ovalbumin signal sequence, into the
genome of a bird. One version of this vector is given below as SEQ ID NO:3. The
vector of SEQ ID NO:3, named pTnMod (Oval/Red) - Chicken, includes chicken
ovalbumin promoter and signal sequences.

SEQ ID NO:3, pTnMod (Oval/Red) - Chicken, includes the following
components:

Base pairs 1-130 are a remainder of F1(-) ori from pBluescriptII sk(-)
(Stratagene), corresponding to bp 1-130 of pBluescriptII sk(-).

Base pairs 131-132 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 133-1777 are the CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems), corresponding to bp 229-1873 of pGWiz.

Base pairs 1778-1779 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 1780-2987 are the coding sequence for the transposase, modified
from Tn10 (GenBank accession J01829) by optimizing codons as discussed above.

Base pairs 2988-2993 are two engineered stop codons.

Base pair 2994 is a residue from ligation of restriction enzyme sites used in
constructing the vector.

Base pairs 2995-3410 are a synthetic polyA sequence taken from the pGWiz
vector (Gene Therapy Systems), corresponding to bp 1922-2337 of pGWiz.

Base pairs 3415-3718 are non-coding DNA that is residual from vector
pNK2859.

Base pairs 3719-3761 are non-coding λ DNA that is residual from
pNK2859.

Base pairs 3762-3831 are the 70 bp of the left insertion sequence recognized
by the transposon Tn10.

Base pairs 3832-3837 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 3838-4044 are part of the multiple cloning site from pBluescriptII
sk(-), corresponding to bp 924-718 of pBluescriptII sk(-).

Base pairs 4045-4049 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 4050-4951 contain upstream elements of the chicken ovalbumin
promoter (including SDRE, steroid-dependent response element). See GenBank
accession number J00895 M24999, bp 431-1332.

Base pairs 4952-4959 are a residue from ligation of
restriction enzyme sites used in constructing the vector.

Base pairs 4960-5112 are the chicken ovalbumin signal sequence (GenBank
accession number J00895 M24999, bp 2996-3148).

Base pairs 5113-5118 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 5119-6011 are the DsRed reporter coding sequence, including
polyA sequence, from pDsRed1.1 (Clontech), corresponding to bp 100-992 of
pDsRed1.1.

Base pairs 6012-6017 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 6018-6056 are part of the multiple cloning site of the ZeroBlunt
Topo cloning vector (Invitrogen), corresponding to bp 337-377 of ZeroBlunt.

Base pairs 6057-6062 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 6063-6495 are part of the multiple cloning site from pBluescriptII
sk(-), corresponding to bp 667-235 of pBluescriptII sk(-).

Base pairs 6496-6500 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 6501-6570 are the 70 bp of the right insertion sequence
recognized by the transposon Tn10.

Base pairs 6571-6612 are non-coding λ DNA that is residual from pNK2859.

Base pairs 6613-7477 are non-coding DNA that is residual from pNK2859.

Base pairs 7478-9678 are from the pBluescriptII sk(-) base vector
(Stratagene, Inc.), corresponding to bp 761-2961 of pBluescriptII sk(-).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s).
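
Where a component is said to correspond to a range in a source plasmid or GenBank record, the two ranges should span the same number of base pairs. The Python sketch below runs that consistency check on the coordinates listed for pTnMod (Oval/Red) - Chicken. It is illustrative only: the labels and function names are introduced here, and the ovalbumin promoter source range is taken as bp 431-1332, which is an assumption based on the citation of the same GenBank record in Example 4.

```python
# (vector_start, vector_end, source_start, source_end, label); 1-based inclusive.
# Source ranges written high-to-low indicate the reverse orientation.
MAPPINGS = [
    (1, 130, 1, 130, "F1(-) ori, pBluescriptII sk(-)"),
    (133, 1777, 229, 1873, "CMV promoter/enhancer, pGWiz"),
    (2995, 3410, 1922, 2337, "synthetic polyA, pGWiz"),
    (3838, 4044, 924, 718, "MCS part, pBluescriptII sk(-)"),
    (4050, 4951, 431, 1332, "ovalbumin upstream elements, GenBank"),  # assumed range
    (4960, 5112, 2996, 3148, "ovalbumin signal sequence, GenBank"),
    (5119, 6011, 100, 992, "DsRed, pDsRed1.1"),
    (6018, 6056, 337, 377, "MCS part, ZeroBlunt"),
    (6063, 6495, 667, 235, "MCS part, pBluescriptII sk(-)"),
    (7478, 9678, 761, 2961, "base vector, pBluescriptII sk(-)"),
]


def span(a, b):
    """Number of base pairs in an inclusive range, in either orientation."""
    return abs(b - a) + 1


def length_mismatches(mappings):
    """Return (label, vector_length, source_length) where the lengths differ."""
    return [
        (label, span(vs, ve), span(ss, se))
        for vs, ve, ss, se, label in mappings
        if span(vs, ve) != span(ss, se)
    ]


mismatches = length_mismatches(MAPPINGS)
print(mismatches)
```

With the coordinates as listed, every pair agrees except the ZeroBlunt multiple cloning site fragment, where the vector range spans 39 bp and the cited source range spans 41 bp. The check only reports the difference; it cannot say which of the two ranges, if either, is misprinted.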



EXAMPLE 4 

Preparation of Transposon-Based Vector pTnMod (Oval/Red) - Quail

A vector was designed for inserting a reporter gene (DsRed) under the control
of the ovalbumin promoter, and including the ovalbumin signal sequence, into the
genome of a bird, given below as SEQ ID NO:4. The vector of SEQ ID NO:4, named
pTnMod (Oval/Red) - Quail, has been constructed, and selected portions of the
sequence have been verified by re-sequencing.

SEQ ID NO:4, pTnMod (Oval/Red) - Quail, includes the following
components:

Base pairs 1-130 are a remainder of F1(-) ori from pBluescriptII sk(-)
(Stratagene), corresponding to bp 1-130 of pBluescriptII sk(-).

Base pairs 131-132 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 133-1777 are the CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems), corresponding to bp 229-1873 of pGWiz.

Base pairs 1778-1779 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 1780-2987 are the coding sequence for the transposase, modified
from Tn10 (GenBank accession J01829) by optimizing codons as discussed above.

Base pairs 2988-2993 are two engineered stop codons.

Base pair 2994 is a residue from ligation of restriction enzyme sites used in
constructing the vector.

Base pairs 2995-3410 are a synthetic polyA sequence taken from the pGWiz
vector (Gene Therapy Systems), corresponding to bp 1922-2337 of pGWiz.

Base pairs 3415-3718 are non-coding DNA that is residual from vector
pNK2859.

Base pairs 3719-3761 are non-coding λ DNA that is residual from pNK2859.

Base pairs 3762-3831 are the 70 base pairs of the left insertion sequence
recognized by the transposon Tn10.

Base pairs 3832-3837 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 3838-4044 are part of the multiple cloning site from pBluescriptII
sk(-), corresponding to bp 924-718 of pBluescriptII sk(-).

Base pairs 4045-4049 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 4050-4934 are the Japanese quail ovalbumin promoter (including
SDRE, steroid-dependent response element). The Japanese quail ovalbumin promoter
was isolated by its high degree of homology to the chicken ovalbumin promoter
(GenBank accession number J00895 M24999, base pairs 431-1332). Some deletions
were noted in the quail sequence, as compared to the chicken sequence.

Base pairs 4935-4942 are a residue from ligation of restriction enzyme sites
used in constructing the vector.

Base pairs 4943 - 5092 are the Japanese quail ovalbumin signal sequence. The 
quail signal sequence was isolated by its high degree of homology to the chicken 
25 signal sequence (GenBank accession number J00895 M24999, base pairs 2996-3 1 48). 
Some deletions were noted in the quail sequence, as compared to the chicken 
sequence. 

Base pairs 5093-5098 are a residue from ligation of restriction enzyme sites 
used in constructing the vector. 
30 Base pairs 5099 - 5991 are the DsRed reporter coding sequence, including 

polyA sequence, from pDsRedl.l (Clontech), corresponding to bp 100 - 992 of 
pDsRed 1.1. 
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Base pairs 5992-5997 are a residue from ligation of restriction enzyme sites 
used in constructing the vector. 

. Base pairs 5998 - 6036 are part of the multiple cloning site of the ZeroBlunt 
Topo cloning vector (Invitrogen), corresponding to base pairs 337-377 of ZeroBlunt. 
5 Base pairs 6037-6042 are a residue from ligation of restriction enzyme sites 

used in constructing the vector. 

Base pairs 6043 • 6475 are part of the multiple cloning site from pBluescriptll 
sk(-), corresponding to bp 667-235 of pBluescriptll sk(-). 

Base pairs 6476-6480 are a residue from ligation of restriction enzyme sites 
1 0 used in constructing the vector. 

Base pairs 6481 - 6550 are the 70 bp of the right insertion sequence 
recognized by the transposon TnlO. 

Base pairs 6551 - 6592 are non-coding X, DNA that is residual from pNK2859. 
Base pairs 6593 - 7457 are non-coding DNA that is residual from pNK2859, 
15 Base pairs 7458 - 9658 are from the pBluescriptll sk(-) base vector 

(Stratagene. Inc.), corresponding to base pairs 761-2961 of pBluescriptll sk(-). 

It should be noted that all non-coding DNA sequences described above can be 
replaced with any other non-coding DNA sequence(s). 
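
As a cross-check on the component list for SEQ ID NO:4, the length of each vector segment can be compared with the length of the source range it is said to correspond to. The following is only a minimal sketch: the coordinates are copied from the listing above, while the labels, the `span` helper and the `length_mismatches` function are illustrative names that do not appear in the specification.

```python
# Each entry: (label, vector_start, vector_end, source_start, source_end).
# Coordinates are copied from the component list for SEQ ID NO:4;
# the labels are illustrative only.
SEGMENTS = [
    ("F1(-) ori remainder", 1, 130, 1, 130),
    ("CMV promoter/enhancer", 133, 1777, 229, 1873),
    ("synthetic polyA", 2995, 3410, 1922, 2337),
    ("pBluescriptII MCS (left)", 3838, 4044, 924, 718),
    ("DsRed reporter + polyA", 5099, 5991, 100, 992),
    ("ZeroBlunt MCS", 5998, 6036, 337, 377),
    ("pBluescriptII MCS (right)", 6043, 6475, 667, 235),
    ("pBluescriptII base vector", 7458, 9658, 761, 2961),
]


def span(start, end):
    """Inclusive length of a range; also handles reversed ranges (end < start)."""
    return abs(end - start) + 1


def length_mismatches(segments):
    """Return (label, vector_len, source_len) for entries whose lengths differ."""
    mismatches = []
    for label, v_start, v_end, s_start, s_end in segments:
        v_len, s_len = span(v_start, v_end), span(s_start, s_end)
        if v_len != s_len:
            mismatches.append((label, v_len, s_len))
    return mismatches


if __name__ == "__main__":
    for label, v_len, s_len in length_mismatches(SEGMENTS):
        print(f"{label}: vector {v_len} bp vs. source {s_len} bp")
```

With the coordinates as listed, every pair agrees in length except the ZeroBlunt entry (39 bp in the vector against 41 bp in the source range). The text does not indicate whether this reflects trimming during cloning or a slip in the listing.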

EXAMPLE 5

Transfection of Stage X Japanese Quail Eggs with pTnMod(Oval/Red) - Quail via 
embryo injection 

Transgenic Japanese quail were produced by transfecting Stage X embryos,
and the heritability of the transgene delivered by embryo transfection was established.
More specifically, fertile eggs were collected in the morning and placed at 15° C until
enough were collected for injection, but were held no longer than 7 days. Stage X
embryos (eggs) were assigned to one of two treatment groups. Before treatment, each
egg was incubated on its side at room temperature for about 2 hours to allow the
embryo to move to "top dead center" (TDC). Each egg was transfected by drilling a 1
mm hole (directly above the embryo) through the shell without penetrating the
underlying shell membrane. A 0.5 ml syringe fitted with a 28 gauge needle was used
to deliver DNA complexed to a transfecting reagent, i.e. SUPERFECT®, in a 50 µl
volume. An adhesive disc was used to seal the hole and provide a label for treatment
identification. After all eggs were transfected, they were set in an incubator with the
adhesive disc pointing upward for hatching.

Each bird that hatched was bled at one week of age, DNA was extracted from
blood cells, and PCR was conducted using 28s primers as a positive control and
primers specific to DsRed. Any bird that was negative was terminated, while positive
birds were monitored to determine maintenance of the transgene. Birds consistently
positive were maintained until sexual maturity and bred. Positive male and female
birds were mated. The eggs of mated hens were hatched and the resulting chicks, the
G1 generation, were evaluated to determine if they were transgenic. All G1s resulting
from this mating were bled and PCR conducted as described above.

Egg injection: Two treatment groups and one control group were used for this
experiment. Vector pTnMod(Oval/Red) in supercoiled form (Treatment 1) and in
linear form (Treatment 2) were used to transfect 15 eggs per treatment. To obtain
linear DNA for this experiment, pTnMod(Oval/Red) was digested with NgoM IV,
column purified, and resuspended in TE buffer.

Each egg was injected with 0.75 µg of DNA complexed with SUPERFECT®
in a 1:3 ratio in a total injection volume of 50 µl. Hank's Balanced Salt Solution
(HBSS) was used to bring the volume to 50 µl. The DNA-SUPERFECT® mixture must be
allowed to incubate (for complex formation) at room temperature for 10 minutes prior
to injection and must be used within 40 minutes after initial mixing. Eggs were
incubated as described above after injection.
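
The make-up of the 50 µl injection can be worked out with a short calculation. This is a sketch under two stated assumptions that are not in the text: the 1:3 ratio is read as µg of DNA to µl of reagent, and the DNA stock concentration (0.5 µg/µl in the usage line) is a purely hypothetical value. The function name `injection_mix` is likewise illustrative.

```python
def injection_mix(dna_ug, reagent_ul_per_ug, dna_stock_ug_per_ul, total_ul=50.0):
    """Volumes (in µl) of DNA stock, transfection reagent and HBSS for one egg.

    Assumes the DNA:reagent ratio is µg DNA : µl reagent (an assumption;
    the example does not state the units of the 1:3 ratio).
    """
    dna_ul = dna_ug / dna_stock_ug_per_ul
    reagent_ul = dna_ug * reagent_ul_per_ug
    hbss_ul = total_ul - dna_ul - reagent_ul
    if hbss_ul < 0:
        raise ValueError("DNA and reagent volumes exceed the total injection volume")
    return {"dna_ul": dna_ul, "reagent_ul": reagent_ul, "hbss_ul": hbss_ul}


if __name__ == "__main__":
    # 0.75 µg DNA, 1:3 ratio, hypothetical 0.5 µg/µl DNA stock, 50 µl total.
    print(injection_mix(0.75, 3.0, 0.5))
```

Scaling each volume by the number of eggs in a treatment group (15 here), plus an allowance for pipetting loss, gives a master mix that still has to be used within the 40-minute window stated above.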

Results: In the supercoiled injection group, 2 females and 1 male were
identified as PCR positive using primers specific to the DsRed coding sequence.
These birds were mated as described above. Blood was taken from the G1 chicks and
PCR was conducted. The results showed that the transgene was incorporated into the
gametes of these birds. The G1 chicks from these birds were examined on a weekly
basis until it was verified that the gene was not present or enough transgenic G1s were
obtained to initiate a breeding flock of fully transgenic birds. Eggs from these G1
chicks expressed DsRed protein in the albumin portion of their eggs.

EXAMPLE 6 

Intratesticular Injection of Chickens with pTnMod(CMV/Red) (SEQ ID NO:2)

Immature birds of different ages (4, 6, 8, 10, 12, and 14 weeks) were placed
under anesthesia and injected in the testes with the construct pTnMod(CMV/Red). A
saline solution containing 1-5 µg of purified DNA vector was mixed with SUPERFECT®
transfecting reagent (Qiagen, Valencia, CA) in a 1:6 (wt:vol) ratio. The volume of
saline was adjusted so that the total volume injected into each testis was 150-200 µl,
depending on the age and size of the bird. For the 4- and 6-week-old chickens, 1 µg
DNA in 150 µl was injected in each testis, divided into three doses of 50 µl each. For
the older birds, 200 µl total volume was injected, containing either 3 µg DNA (for
8-week-old birds) or 5 µg DNA (for older birds) per testis. First, one testis was
surgically exposed prior to injection. After injection, the incision was sutured, and the
sequence was repeated for the alternate testis.

From six to nine months post-surgery, weekly sperm samples were taken from
each injected bird, as well as from control birds. Each sperm sample was evaluated for
uptake and expression of the injected gene. Samples were evaluated by PCR on whole
sperm, within one week after collection.

Approximately 100 male white leghorn chickens, in groups of 5-26, at ages 4,
6, 8, 10, 12, and 14 weeks, were used as this is the age range in which it is expected
that the testes are likely to be most "receptive." In this age range, the blood/testis
barrier has not yet formed, and there is a relatively high number of spermatogonia
relative to the numbers of other cell types, e.g., spermatids, etc. See J. Kumaran et al.,
1949. Poultry Sci., vol. 29, pp. 511-520. See also E. Oakberg, 1956. Am. J. Anatomy,
vol. 99, pp. 507-515; and P. Kluin et al., 1984. Anat. Embryol., vol. 169, pp. 73-78.

The experimental and control males were obtained from commercial sources
at one day of age, and maintained in brooders until used. The male birds were housed
in temperature-controlled spaces in individual standard caging as they approached
maturity. They were given water and standard commercial feed ad lib. They were
kept initially in a 23:1 hour light/dark cycle, stepped down at approximately weekly
intervals to a 15:8 hour light/dark cycle, as this regimen has been reported to optimize
sexual maturity and fertility.
Surgical and DNA Injection Procedures 

At the appropriate ages, groups of individual males were starved overnight and
then subjected to transgene delivery by direct intratesticular injection of DNA by
experienced animal surgeons. Each male was anesthetized with isoflurane via a
simplified gas machine.

Various devices and anesthesia machines have previously been described for
administering isoflurane (and other gaseous anesthetics) to birds. See Alsage et al.,
Poultry Sci., 50:1876-1878 (1971); Greenlees et al., Am. J. Vet. Res., vol. 51, pp.
757-758 (1990). However, these prior techniques are somewhat cumbersome and
complex to implement. Because of these deficiencies in the prior art, a novel and much
simpler system to administer isoflurane (or other gaseous) anesthesia was developed, a
system that we found worked well on all ages of chicks. A standard nose cone was
placed over the chick's head, similar to the system that has been used for decades to
administer ether to mice. A plastic tube approximately 3.5 cm in diameter and 12 cm
long was filled with cotton, into which was poured approximately 2 mL isoflurane
(Abbott Laboratories, Chicago). The chick's head was placed partially into the
cylinder, and was held in place there intermittently throughout the surgery as required
to maintain the proper plane of anesthesia, without overdosing.

Each anesthetized bird was positioned on its side on an animal board with
cords tractioning the wings and feet to allow access to the testes area. The area was
swabbed with 0.5% chlorhexidine, and a 2 cm dorsolateral incision was made in the
skin over the testis (similar to the procedure commonly used for caponization). A
small-animal retractor was used to spread the last two ribs, exposing the testis. The
DNA solution was then mixed with SUPERFECT® (Qiagen) according to the
manufacturer's protocol, approximately a 1:6 wt/vol ratio, to a final concentration of
0.01 - 0.05 µg/µl. This resulted in 1 - 5 µg total DNA (in a 150-200 µl volume) being
injected into each testis, spread over three injection sites: one at each end of the testis,
and one in the middle.

The injection device was a standard 25 gauge, 1/2 inch (1.27 cm) hypodermic
needle, attached to a 50, 100, or 200 µl syringe. Approximately 5 mm of the needle
tip was bent at a 90 degree angle, to facilitate insertion into the testes. Approximately
50 - 70 µl of the DNA-SUPERFECT® solution was injected into each of three sites
per testis. The multiple injections were calculated to suffuse the DNA throughout the
whole testis, the idea being to promote contact between DNA and spermatogonia as
much as feasible. We estimated that our procedure resulted in the injection of about
100,000 DNA molecules per spermatogonium. The construct used in these tests
contained a highly potent constitutive modified CMV promoter, operatively linked to
the DsRed gene as shown in SEQ ID NO:2.
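
An estimate of this kind can be approximated by converting the injected DNA mass into a number of plasmid molecules. The sketch below is not the calculation actually used for the estimate above. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA (a common approximation) and a hypothetical vector length of 9,000 bp, since the length of SEQ ID NO:2 is not stated in this passage. The function names are illustrative.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
AVG_BP_MASS = 650.0        # g/mol per base pair of dsDNA (approximation)


def plasmid_copies(mass_ug, length_bp):
    """Approximate number of plasmid molecules in a given mass of DNA."""
    grams = mass_ug * 1e-6
    molar_mass = length_bp * AVG_BP_MASS   # g/mol for the whole plasmid
    return grams / molar_mass * AVOGADRO


def cells_at_ratio(copies, copies_per_cell=100_000):
    """Number of cells that could each receive `copies_per_cell` molecules."""
    return copies / copies_per_cell


if __name__ == "__main__":
    # 5 µg per testis; 9,000 bp is a hypothetical vector length.
    copies = plasmid_copies(5.0, 9000)
    print(f"{copies:.2e} plasmid molecules")
    print(f"{cells_at_ratio(copies):.2e} cells at 100,000 molecules per cell")
```

Under these assumptions a 5 µg dose corresponds to roughly 5 x 10^11 plasmid molecules, so a figure of 100,000 molecules per spermatogonium would imply a target population of a few million cells per testis.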

Following injection, the incision was closed in two layers with 4-0 absorbable
suture, and then the contralateral testis was similarly exposed and injected. Following
surgery, each bird was returned to its cage to recover. One hundred thirteen males
were ultimately used in the experimental regimen to increase the overall likelihood of
success, along with 4 control birds (16 weeks old) subjected to sham surgery (with
injections containing only the transfection reagent).

Evaluation of Birds

Thus, a total of 113 white leghorn chickens were injected with the DNA vector
in groups of 5-26 at varying ages. Fourteen birds were transformed at 4 weeks; 23
birds at 6 weeks; 26 birds at 8 weeks; 23 birds at 10 weeks; 5 birds at 12 weeks; and
22 birds at 14 weeks. Sixteen birds died before they could be sampled, so to date, 97
roosters have been sampled, plus the four controls. Birds were evaluated at 18-24
weeks of age for (a) potential transformation in the sperm, and (b) successful testis
transfection. Sperm samples were obtained from each rooster by manual
manipulation using standard techniques. The sperm were washed, and their DNA was
extracted following the techniques of G. Mann et al., 1993. J. Reprod. Fert., 99:505-12.
The samples were then frozen until analyzed. Evaluation was conducted by PCR
analysis to detect DNA integration into the sperm, or into any of the testicular cells.
Additionally, selected testes were harvested at the end of the sperm sampling period.

Of 97 birds tested, at least 22 showed probable positive results. Positive
results were observed at all transformation ages, except for 4 weeks, which was not
tested. At least two birds were confirmed positive by PCR of sperm, conducted four
months after the initial injection. These results were transient in many cases, however,
since it was believed that the DsRed gene product used in these initial proof of
concept experiments was toxic. Nevertheless, the positive PCR results presumptively
demonstrated that the transgene was incorporated into spermatogonia (before
puberty), and that it was carried in transgenic sperm. Such sperm could then transmit
the gene to subsequent generations, resulting in the production of true, germ-line
transgenic "founder" birds.
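
The bird counts reported in this evaluation can be tallied as a consistency check. The numbers below are taken from the text; the variable names are illustrative, and the positive fraction is a lower bound because the text reports "at least 22" probable positives.

```python
# Birds injected at each age (weeks), as reported in the evaluation.
INJECTED_BY_AGE = {4: 14, 6: 23, 8: 26, 10: 23, 12: 5, 14: 22}
DIED_BEFORE_SAMPLING = 16
PROBABLE_POSITIVES = 22    # reported as "at least 22"

total_injected = sum(INJECTED_BY_AGE.values())
sampled = total_injected - DIED_BEFORE_SAMPLING
min_positive_fraction = PROBABLE_POSITIVES / sampled

if __name__ == "__main__":
    print(f"injected: {total_injected}")
    print(f"sampled:  {sampled}")
    print(f"probable positives: at least {min_positive_fraction:.1%}")
```

The per-age counts sum to the 113 injected birds and, after the 16 deaths, to the 97 roosters sampled, so the reported figures are internally consistent.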

To further confirm that the DNA had been incorporated into the sperm, and
that contaminating vector was not being detected from other sources, it was confirmed
through PCR on sperm of experimental birds, and on positive and negative controls,
that the sperm of the experimental birds lacked DNA encoding the transposase. The
design of the preferred transposon-based vector is such that the sequence encoding the
transposase is contained in the vector, but is not incorporated into the transformed
chromosome. Thus, presence of the exogenous coding sequence, coupled with
absence of the transposase gene, is strong evidence for incorporation of the exogenous
coding sequence, or transgene.

These results demonstrated proof of concept, as positive PCR results were
obtained from the sperm of treated birds. Interpretation of these preliminary results
was made more difficult by the fact that the modified CMV promoter used in the
experiment was probably too "hot." As the DsRed product is not secreted from the
cells, the product built up intracellularly to levels that were toxic, frequently killing
the cells. Even this result, of course, means that the transformation was successful.
The transgene could not have killed the cells otherwise.
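
The reasoning behind the transposase control described above (reporter sequence present, transposase sequence absent) can be written out as a simple decision rule. This is only a sketch of the logic: the function name `interpret_sperm_pcr` and the wording of the returned labels are illustrative and are not terms used in the specification.

```python
def interpret_sperm_pcr(reporter_positive, transposase_positive, controls_ok=True):
    """Classify a sperm DNA sample from two PCR results.

    reporter_positive:    PCR signal with primers for the transgene (e.g. DsRed)
    transposase_positive: PCR signal with primers for the transposase sequence
    controls_ok:          positive and negative PCR controls behaved as expected
    """
    if not controls_ok:
        return "uninterpretable: controls failed"
    if reporter_positive and not transposase_positive:
        # The transposase sequence stays on the vector backbone and is not
        # expected to integrate, so this pattern points to integration.
        return "consistent with integrated transgene"
    if reporter_positive and transposase_positive:
        return "possible residual or contaminating vector"
    return "transgene not detected"


if __name__ == "__main__":
    print(interpret_sperm_pcr(True, False))
    print(interpret_sperm_pcr(True, True))
    print(interpret_sperm_pcr(False, False))
```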

In order to resolve the problem with toxicity of the DsRed gene product,
experiments were conducted using a different reporter gene operably linked to the
ovalbumin promoter, so that the transgene was expressed in the egg white. These
experiments are provided in Examples 12-15 below.

EXAMPLE 7 

Transfection of Male White Leghorn Chickens Using the Vector pTnMod(Oval/Red) -
Quail (SEQ ID NO:4) via Testicular Injections

In further experiments conducted on leghorn chickens, it was demonstrated
that chickens injected intratesticularly at 8, 10, 12, or 14 weeks of age had, on
average, approximately 40% positive sperm between 6 and 8 months after injection.
In other experiments, successful transfection was achieved with chickens injected at
13 weeks of age.

Forty-nine white leghorn roosters approximately 8, 10, 12, or 14 weeks of age
were obtained and housed. Birds were identified, wing banded, and assigned to a
treatment group. If appropriate (based on testes size and vascularization), one testis
was caponized and the entire DNA injection volume was delivered to the remaining
testis. Thirty-two males received DNA injections of 5 µg DNA/testis at a 1:3 ratio of
DNA to SUPERFECT®. The remaining birds were used as controls. After injection,
all birds were mated with at least 5 females and observed until sexual maturity and
egg-laying began. All eggs collected prior to peak egg production (approximately 24
weeks of age for the hens) were incubated and candled to determine embryo presence.
Any embryos identified were incubated to hatch; DNA was extracted, PCR was
conducted, and transgene presence was determined.

Roosters positive for the pTnMod(Oval/Red) - Quail construct were kept to
produce F1 offspring (eggs collected at peak production). Offspring from this hatch
were bled, DNA extracted from the blood, and PCR conducted using primers specific
for the DsRed gene. It was determined that 77% of the offspring were transgenic.

EXAMPLE 8 

Transfection of Mature Male Japanese Quail using the vector pTnMod(Oval/Red) -
Quail (SEQ ID NO:4) via Testicular Injections

Twelve sexually mature males (at approximately 13 weeks of age) underwent
surgery for testicular injection as described above for chickens. At 21-28 days of age,
the birds were identified, leg banded, debeaked, and separated based on sex.
Injections comprised 5 µg/testes of the vector in concentrations 1:3 or 1:10 for
SUPERFECT® or a 1:1 ratio with Mirus. The study consisted of 3 treatment groups
with 5 males in the 1:3 DNA:SUPERFECT® group, 3 males in the 1:10
DNA:SUPERFECT® group, and 4 males in the 1:1 Mirus group. All surgeries were
conducted in one day.

Any unincorporated DNA was allowed to clear from the testes by holding the
birds for 19 days before mating with females. At 15 weeks of age, 2 age-matched
females were housed with each treated male. The presence of the transfected DNA
was determined in the fertilized eggs during the second week of egg lay. Subsequent
eggs from parents producing positively identified transgenic eggs were collected and
stored until taken to hatch.

PCR performed on the sperm of quail injected at three months of age indicated
successful incorporation of the DsRed transgene into the quail sperm.

EXAMPLE 9 

Transfection of Immature Male Japanese Quail using the vector pTnMod(Oval/Red) -
Quail (SEQ ID NO:4) via Testicular Injections

Approximately 450 quail eggs were set and hatched. At 21-28 days of age, the
birds were identified, wing banded, debeaked, and separated based on sex. At 4 weeks
of age, 65 male birds underwent surgery and testicular injections as described above.
Injections comprised a control and 2 µg/testes of the vector in varying concentrations
(0, 1/3, 1/5, and 1/10) of three different transfection reagents: 1) SUPERFECT®, 2)
Mirus/Panvera and 3) Dosper. The study comprised 13 treatment groups with 5 males
per group. One transfection reagent was administered per day.

At 7 weeks of age, 2 age-matched females were housed with each treated
male. The presence of the transfected DNA was determined in the fertilized eggs
during the second week of egg lay. Subsequent eggs from parents producing
positively identified transgenic eggs were collected and stored until taken
to hatch. PCR performed on the sperm of quail injected at four and five weeks of age
indicated successful incorporation of the DsRed transgene into the quail sperm.
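
The treatment layout of this example can be enumerated to show how the stated group and bird counts fit together. The sketch below follows one reading of the text, namely one control group plus every combination of the three reagents with the four listed concentrations. This reading is an assumption, chosen because it reproduces the stated 13 groups and 65 males; the names used are illustrative.

```python
from itertools import product

REAGENTS = ["SUPERFECT", "Mirus/Panvera", "Dosper"]
CONCENTRATIONS = ["0", "1/3", "1/5", "1/10"]
MALES_PER_GROUP = 5


def treatment_groups():
    """One control group plus each reagent/concentration combination."""
    groups = [("control", None)]
    groups.extend(product(REAGENTS, CONCENTRATIONS))
    return groups


if __name__ == "__main__":
    groups = treatment_groups()
    print(f"{len(groups)} groups, {len(groups) * MALES_PER_GROUP} males")
    for reagent, concentration in groups:
        print(reagent, concentration)
```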

EXAMPLE 10 

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/p146/PA) -
Chicken

A vector is designed for inserting a p146 gene under the control of a chicken
ovalbumin promoter, and an ovalbumin gene including an ovalbumin signal sequence,
into the genome of a bird, and is given below as SEQ ID NO:29.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-)
(Stratagene) corresponding to base pairs 1-130 of pBluescriptII sk(-).

Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems) corresponding to base pairs 229-1873 of pGWiz.

Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank
accession number J01829).

Base pairs 2988 - 2993 are an engineered stop codon.

Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy
Systems) corresponding to base pairs 1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non coding DNA that is residual from vector
pNK2859.

Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 3838 - 4044 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 924-718 of pBluescriptII sk(-).

Base pairs 4050 - 4951 are a chicken ovalbumin promoter (including SDRE)
that corresponds to base pairs 431-1332 of the chicken ovalbumin promoter in
GenBank Accession Number J00895 M24999.

Base pairs 4958 - 6115 are a chicken ovalbumin signal sequence and
ovalbumin gene that correspond to base pairs 66-1223 of GenBank Accession
Number V00383.1 (the STOP codon being omitted).

Base pairs 6122 - 6271 are a TAG sequence containing a gp41 hairpin loop
from HIV 1, an enterokinase cleavage site and a spacer (synthetic).

Base pairs 6272 - 6316 are a p146 sequence (synthetic) with 2 added stop
codons.

Base pairs 6324 - 6676 are a synthetic polyadenylation sequence from pGWiz
(Gene Therapy Systems) corresponding to base pairs 1920 - 2272 of pGWiz.

Base pairs 6682 - 7114 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 667-235 of pBluescriptII sk(-).

Base pairs 7120 - 7189 are the 70 base pairs of the right insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 7190 - 7231 are λ DNA that is residual from pNK2859.

Base pairs 7232 - 8096 are non coding DNA that is residual from pNK2859.

Base pairs 8097 - 10297 are pBlueScript sk(-) base vector (Stratagene, Inc.)
corresponding to base pairs 761-2961 of pBluescriptII sk(-).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s). Missing nucleotide sequences
in the above construct represent restriction site remnants.
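
The positions not covered by any listed component, i.e. the restriction site remnants referred to above, can be located by scanning the feature coordinates for gaps. The following is a minimal sketch using the coordinates listed for SEQ ID NO:29; the function name `find_gaps` is illustrative.

```python
# (start, end) coordinates of the components listed for SEQ ID NO:29.
FEATURES = [
    (1, 130), (133, 1777), (1780, 2987), (2988, 2993), (2995, 3410),
    (3415, 3718), (3719, 3761), (3762, 3831), (3838, 4044), (4050, 4951),
    (4958, 6115), (6122, 6271), (6272, 6316), (6324, 6676), (6682, 7114),
    (7120, 7189), (7190, 7231), (7232, 8096), (8097, 10297),
]


def find_gaps(features):
    """Return inclusive (start, end) ranges not covered by adjacent features."""
    ordered = sorted(features)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


if __name__ == "__main__":
    for start, end in find_gaps(FEATURES):
        print(f"bp {start}-{end}: {end - start + 1} bp unassigned")
```

For the listing above this yields eleven unassigned stretches of between 1 and 7 bp each, for example bp 131-132 and bp 3832-3837, which match the ligation residues spelled out explicitly for the corresponding positions of SEQ ID NO:4.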



EXAMPLE 11 

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/p146/PA) - Quail

A vector is designed for inserting a p146 gene under the control of a quail
ovalbumin promoter, and an ovalbumin gene including an ovalbumin signal sequence,
into the genome of a bird, and is given below as SEQ ID NO:30.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-)
(Stratagene) corresponding to base pairs 1-130 of pBluescriptII sk(-).

Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems) corresponding to base pairs 229-1873 of pGWiz.

Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank
accession number J01829).

Base pairs 2988 - 2993 are an engineered stop codon.

Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy
Systems) corresponding to base pairs 1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non coding DNA that is residual from vector
pNK2859.

Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 3838 - 4044 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 924-718 of pBluescriptII sk(-).

Base pairs 4050 - 4938 are the Japanese quail ovalbumin promoter (including
SDRE, steroid-dependent response element). The Japanese quail ovalbumin promoter
was isolated by its high degree of homology to the chicken ovalbumin promoter
(GenBank accession number J00895 M24999, base pairs 431-1332).

Base pairs 4945 - 6092 are a quail ovalbumin signal sequence and ovalbumin gene
that correspond to base pairs 54 - 1201 of GenBank accession number X53964.1
(the STOP codon being omitted).

Base pairs 6097 - 6246 are a TAG sequence containing a gp41 hairpin loop
from HIV 1, an enterokinase cleavage site and a spacer (synthetic).

Base pairs 6247 - 6291 are a p146 sequence (synthetic) with 2 added stop
codons.

Base pairs 6299 - 6651 are a synthetic polyadenylation sequence from pGWiz
(Gene Therapy Systems) corresponding to base pairs 1920 - 2272 of pGWiz.

Base pairs 6657 - 7089 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 667-235 of pBluescriptII sk(-).

Base pairs 7095 - 7164 are the 70 base pairs of the right insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 7165 - 7206 are λ DNA that is residual from pNK2859.

Base pairs 7207 - 8071 are non coding DNA that is residual from pNK2859.

Base pairs 8072 - 10272 are pBlueScript sk(-) base vector (Stratagene, Inc.)
corresponding to base pairs 761-2961 of pBluescriptII sk(-).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s). Missing nucleotide sequences
in the above construct represent restriction site remnants.

EXAMPLE 12

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/ProIns/PA) -
Chicken

A vector is designed for inserting a proinsulin gene under the control of a
chicken ovalbumin promoter, and an ovalbumin gene including an ovalbumin signal
sequence, into the genome of a bird, and is given below as SEQ ID NO:31.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-)
(Stratagene) corresponding to base pairs 1-130 of pBluescriptII sk(-).

Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems) corresponding to base pairs 229-1873 of pGWiz.

Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank
accession number J01829).

Base pairs 2988 - 2993 are an engineered stop codon.

Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy
Systems) corresponding to base pairs 1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non coding DNA that is residual from vector
pNK2859.

Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 3838 - 4044 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 924-718 of pBluescriptII sk(-).

Base pairs 4050 - 4951 are a chicken ovalbumin promoter (including SDRE)
that corresponds to base pairs 431-1332 of the chicken ovalbumin promoter in
GenBank Accession Number J00895 M24999.

Base pairs 4958 - 6115 are a chicken ovalbumin signal sequence and
ovalbumin gene that correspond to base pairs 66-1223 of GenBank Accession
Number V00383.1 (the STOP codon being omitted).

Base pairs 6122 - 6271 are a TAG sequence containing a gp41 hairpin loop
from HIV 1, an enterokinase cleavage site and a spacer (synthetic).

Base pairs 6272 - 6531 are a proinsulin gene.

Base pairs 6539 - 6891 are a synthetic polyadenylation sequence from pGWiz
(Gene Therapy Systems) corresponding to base pairs 1920 - 2272 of pGWiz.

Base pairs 6897 - 7329 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 667-235 of pBluescriptII sk(-).

Base pairs 7335 - 7404 are the 70 base pairs of the right insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 7405 - 7446 are λ DNA that is residual from pNK2859.

Base pairs 7447 - 8311 are non coding DNA that is residual from pNK2859.

Base pairs 8312 - 10512 are pBlueScript sk(-) base vector (Stratagene, Inc.)
corresponding to base pairs 761-2961 of pBluescriptII sk(-).

It should be noted that all non-coding DNA sequences described above can be
replaced with any other non-coding DNA sequence(s). Missing nucleotide sequences
in the above construct represent restriction site remnants.

EXAMPLE 13 

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/ProIns/PA) -
Quail

A vector is designed for inserting a proinsulin gene under the control of a
chicken ovalbumin promoter, and an ovalbumin gene including an ovalbumin signal
sequence, into the genome of a bird, and is given below as SEQ ID NO:32.


Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-)
(Stratagene) corresponding to base pairs 1-130 of pBluescriptII sk(-).

Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector
pGWiz (Gene Therapy Systems) corresponding to base pairs 229-1873 of pGWiz.

Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank
accession number J01829).

Base pairs 2988 - 2993 are an engineered stop codon.

Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy
Systems) corresponding to base pairs 1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non coding DNA that is residual from vector
pNK2859.

Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 3838 - 4044 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 924-718 of pBluescriptII sk(-).

Base pairs 4050 - 4938 are the Japanese quail ovalbumin promoter (including
SDRE, steroid-dependent response element). The Japanese quail ovalbumin promoter
was isolated by its high degree of homology to the chicken ovalbumin promoter
(GenBank accession number J00895 M24999, base pairs 431-1332). Some deletions
were noted in the quail sequence, as compared to the chicken sequence.

Base pairs 4945 - 6092 are a quail ovalbumin signal sequence and ovalbumin
gene that correspond to base pairs 54 - 1201 of GenBank accession number
X53964.1 (the STOP codon being omitted).

Base pairs 6093 - 6246 are a TAG sequence containing a gp41 hairpin loop
from HIV 1, an enterokinase cleavage site and a spacer (synthetic).

Base pairs 6247 - 6507 are a proinsulin gene.

Base pairs 6514 - 6866 are a synthetic polyadenylation sequence from pGWiz
(Gene Therapy Systems) corresponding to base pairs 1920 - 2272 of pGWiz.

Base pairs 6867 - 7303 are a multiple cloning site from pBlueScriptII sk(-)
corresponding to base pairs 667-235 of pBluescriptII sk(-).

Base pairs 7304 - 7379 are the 70 base pairs of the right insertion sequence
(IS10) recognized by the transposon Tn10.

Base pairs 7380 - 7421 are λ DNA that is residual from pNK2859.

Base pairs 7422 - 8286 are non coding DNA that is residual from pNK2859.

Base pairs 8287 - 10487 are pBlueScript sk(-) base vector (Stratagene, Inc.)
corresponding to base pairs 761-2961 of pBluescriptII sk(-).




It should be noted that all non-coding DNA sequences described above can be 
replaced with any other non-coding DNA sequence(s). Missing nucleotide sequences 
in the above construct represent restriction site remnants. 
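
The positions of the restriction site remnants can be recovered from the coordinates listed for this construct. The following is a minimal Python sketch, not part of the original disclosure, that lists every stretch of base pairs left unannotated between consecutive elements. The names `FEATURES` and `find_gaps` are hypothetical; the coordinates are copied from the description of SEQ ID NO:32 above.

```python
# Annotated elements of the construct, as 1-based inclusive (start, end, label)
# coordinates copied from the text. Labels are shortened for readability.
FEATURES = [
    (1, 130, "F1(-) ori remainder"),
    (133, 1777, "CMV promoter/enhancer"),
    (1780, 2987, "transposase"),
    (2988, 2993, "engineered stop codon"),
    (2995, 3410, "synthetic polyA"),
    (3415, 3718, "non-coding DNA (pNK2859)"),
    (3719, 3761, "lambda DNA (pNK2859)"),
    (3762, 3831, "left insertion sequence"),
    (3838, 4044, "multiple cloning site"),
    (4050, 4938, "quail ovalbumin promoter"),
    (4945, 6092, "ovalbumin signal sequence and gene"),
    (6093, 6246, "TAG / enterokinase site / spacer"),
    (6247, 6507, "proinsulin gene"),
    (6514, 6866, "synthetic polyadenylation sequence"),
    (6867, 7303, "multiple cloning site"),
    (7304, 7379, "right insertion sequence"),
    (7380, 7421, "lambda DNA (pNK2859)"),
    (7422, 8286, "non-coding DNA (pNK2859)"),
    (8287, 10487, "pBluescript base vector"),
]


def find_gaps(features):
    """Return (start, end) ranges not covered between consecutive features."""
    ordered = sorted(features)
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


if __name__ == "__main__":
    for start, end in find_gaps(FEATURES):
        print(f"unannotated bp {start}-{end} ({end - start + 1} bp)")
```

Under these coordinates the sketch reports eight short unannotated stretches of 1 to 6 bp each, which is the size range expected of restriction site remnants.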

EXAMPLE 14 

Transfection of Immature Leghorn Roosters using a Transposon-based Vector 
containing a Proinsulin Gene via Testicular Injections 

Vectors containing the elements Oval promoter/Oval gene/GP41 Enterokinase 
TAG/Proinsulin/Poly A (SEQ ID NO:31) and CMV promoter/Oval gene/GP41 
Enterokinase TAG/Proinsulin/Poly A (SEQ ID NO:42) were each injected into the 
testes of 11 week old white leghorn roosters. These birds were held under normal 
conditions until sexual maturity was reached. 

At the time of sexual maturity, each bird was handled and manipulated to 
obtain sperm. Sperm samples were collected in Hank's Buffered Salt Solution 
(HBSS) and stored at either -20° C or 4° C until needed. DNA was extracted from 
sperm using a MoBio Ultra Clean DNA Bloodspin Kit (MoBio Laboratories, Solana 
Beach CA). Fifty microliters of sperm was used in the DNA extraction protocol and 
the purified genomic DNA was eluted in 100 μl of water. In each PCR reaction, 
approximately 0.5 - 0.75 μg of genomic DNA was used with primers anchored in the 
entag-1 (5') and the synthetic polyA-2 (3'), which amplify a 685 bp fragment. Five of 
nine birds gave positive reactions for the presence of the appropriate vector construct. 
These birds were then mated with normal females. 

Birds that did not yield positive results with PCR on the sperm were 
sacrificed, their testes removed, and DNA extracted using an approximately 25 mg 
piece of tissue in a Qiagen DNEasy Tissue Kit; purified DNA was eluted in 200 μl 
water and PCR conducted as described above. Two of these birds gave a very strong, 
positive PCR reaction. 

EXAMPLE 15 

Transfection of Japanese Quail using a Transposon-based Vector containing a 
Proinsulin Gene via Oviduct Injections 

Two experiments were conducted in Japanese quail using transposon-based 
vectors containing either Oval promoter/Oval gene/GP41 Enterokinase 
TAG/Proinsulin/Poly A (SEQ ID NO:31) or CMV promoter/Oval gene/GP41 
Enterokinase TAG/Proinsulin/Poly A (SEQ ID NO:42). 

In the first experiment, the Oval promoter/Oval gene/GP41 Enterokinase 
TAG/Proinsulin/Poly A containing construct was injected into the oviduct of sexually 
mature quail; three hens received 5 μg at a 1:3 Superfect ratio and three received 10 
μg at a 1:3 Superfect ratio. As of the writing of the present application, at least one 
bird that received 10 μg of DNA was producing human proinsulin in egg white (other 
birds remain to be tested). This experiment indicates that 1) the DNA has been stable 
for at least 3 months; 2) protein levels are comparable to those observed with a 
constitutive promoter such as the CMV promoter; and 3) sexually mature birds can be 
injected and results obtained without the need for cell culture. 

In the second experiment, the transposon-based vector containing CMV 
promoter/Oval gene/GP41 Enterokinase TAG/Proinsulin/Poly A was injected into the 
oviduct of sexually immature Japanese quail. A total of 9 birds were injected. Of the 
8 survivors, 3 produced human proinsulin in the white of their eggs for over 6 weeks. 
An ELISA assay described in detail below was developed to detect GP41 in the fusion 
peptide (Oval gene/GP41 Enterokinase TAG/Proinsulin) since the GP41 peptide 
sequence is unique and not found as part of normal egg white protein. In all ELISA 
assays, the same birds produced positive results and all controls worked as expected. 

ELISA Procedure: Individual egg white samples were diluted in sodium 
carbonate buffer, pH 9.6, and added to individual wells of 96 well microtiter ELISA 
plates at a total volume of 0.1 ml. These plates were then allowed to coat overnight at 
4° C. Prior to ELISA development, the plates were allowed to warm to room 
temperature. Upon decanting the coating solutions and blotting away any excess, 
non-specific binding of antibodies was blocked by adding a solution of phosphate 
buffered saline (PBS), 1% (w/v) BSA, and 0.05% (v/v) Tween 20 and allowing it to 
incubate with shaking for a minimum of 45 minutes. This blocking solution was 
subsequently decanted and replaced with a solution of the primary antibody (Goat 
Anti-GP41 TAG) diluted in fresh PBS/BSA/Tween 20. After a two hour period of 
incubation with the primary antibody, each plate was washed with a solution of PBS 
and 0.05% Tween 20 in an automated plate washer to remove unbound antibody. 
Next, the secondary antibody, Rabbit anti-Goat Alkaline Phosphatase-conjugated, was 
diluted in PBS/BSA/Tween 20 and allowed to incubate 1 hour. The plates were then 
subjected to a second wash with PBS/Tween 20. Antigen was detected using a 
solution of p-Nitrophenyl Phosphate in Diethanolamine Substrate Buffer for Alkaline 
Phosphatase and measuring the absorbance at 30 minutes and 1 hour. 
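
For planning purposes, the timed incubations of the ELISA procedure can be tallied. The following Python sketch is an illustration only, not part of the original disclosure. It records the minimum incubation times stated above and sums them; the names `Step`, `STEPS` and `minimum_bench_minutes` are hypothetical. The overnight coating and the two wash steps are excluded because no duration is given for the washes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    minutes: int  # minimum duration stated in the procedure


# Timed incubations that follow the overnight coating at 4 C.
STEPS = [
    Step("block with PBS / 1% BSA / 0.05% Tween 20", 45),
    Step("primary antibody (goat anti-GP41 TAG)", 120),
    Step("secondary antibody (rabbit anti-goat, AP-conjugated)", 60),
    Step("substrate development up to the final absorbance reading", 60),
]

# Absorbance is read twice during substrate development (minutes after addition).
READ_POINTS_MIN = (30, 60)


def minimum_bench_minutes(steps):
    """Sum of the stated incubation times, excluding washes and coating."""
    return sum(step.minutes for step in steps)


if __name__ == "__main__":
    total = minimum_bench_minutes(STEPS)
    print(f"minimum timed incubation: {total} min ({total / 60:.2f} h)")
```

The total is a lower bound on the assay day, since blocking is specified as a minimum of 45 minutes and wash times depend on the plate washer used.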

EXAMPLE 16 

Optimization of Intra-oviduct and Intra-ovarian Arterial Injections 

Overall transfection rates of oviduct cells in a flock of chicken or quail hens 
are enhanced by synchronizing the development of the oviduct and ovary within the 
flock. When the development of the oviducts and ovaries is uniform across a group 
of hens and when the stage of oviduct and ovarian development can be determined or 
predicted, timing of injections is optimized to transfect the greatest number of cells. 
Accordingly, oviduct development is synchronized as described below to ensure that a 
large and uniform proportion of oviduct secretory cells are transfected with the gene 
of interest. 

Hens are treated with estradiol to stimulate oviduct maturation as described in 
Oka and Schimke (T. Oka and RT Schimke, J. Cell Biol., 41, 816 (1969)) and Palmiter, 
Christensen and Schimke (J Biol. Chem. 245(4):833-845, 1970). Specifically, 
repeated daily injections of 1 mg estradiol benzoate are performed sometime before 
the onset of sexual maturation, a period ranging from 1-14 weeks of age. After a 
stimulation period sufficient to maximize development of the oviduct, hormone 
treatment is withdrawn thereby causing regression in oviduct secretory cell size but 
not cell number. At an optimum time after hormone withdrawal, the oviducts of 
treated hens are injected with the transposon-based vector. Hens are subjected to 
additional estrogen stimulation after an optimized time during which the 
transposon-based vector is taken up into oviduct secretory cells. Re-stimulation by 
estrogen activates the transposon mechanism of the transposon-based vector, causing the 
integration of the gene of interest into the host genome. Estrogen stimulation is then 
withdrawn and hens continue normal sexual development. If a developmentally 
regulated promoter such as the ovalbumin promoter is used, expression of the 
transposon-based vector initiates in the oviduct at the time of sexual maturation. 
Intra-ovarian artery injection during this window allows for high and uniform 
transfection efficiencies of ovarian follicles to produce germ-line transfections and 
possibly oviduct expression. 

Other means are also used to synchronize the development, or regression, of 
the oviduct and ovary to allow high and uniform transfection efficiencies. Alterations 
of lighting and/or feed regimens, for example, cause hens to 'molt' during which time 
the oviduct and ovary regress. Molting is used to synchronize hens for transfection, 
and may be used in conjunction with other hormonal methods to control regression 
and/or development of the oviduct and ovary. 



EXAMPLE 17 

Isolation of Human Proinsulin Using Anti-TAG Column Chromatography 

A HiTrap NHS-activated 1 mL column (Amersham) was charged with a 30 
amino acid peptide that contained the gp-41 epitope containing gp-41's native 
disulfide bond that stabilizes the formation of the gp-41 hairpin loop. The 30 amino 
acid gp41 peptide is provided as SEQ ID NO:23. Approximately 10 mg of the 
peptide was dissolved in coupling buffer (0.2 M NaHCO3, 0.5 M NaCl, pH 8.3) and 
the ligand was circulated on the column for 2 hours at room temperature at 0.5 
mL/minute. Excess active groups were then deactivated using 6 column volumes of 
0.5 M ethanolamine, 0.5 M NaCl, pH 8.3 and the column was washed alternately with 
6 column volumes of acetate buffer (0.1 M acetate, 0.5 M NaCl, pH 4.0) and 
ethanolamine (above). The column was neutralized using 1 X PBS. The column was 
then washed with buffers to be used in affinity purification: 75 mM Tris, pH 8.0 and 
elution buffer, 100 mM glycine-HCl, 0.5 M NaCl, pH 2.7. Finally, the column was 
equilibrated in 75 mM Tris buffer, pH 8.0. 
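
The buffer compositions above translate directly into weighed amounts. The following Python sketch is an illustration only, not part of the original disclosure. It computes the mass of each solute for a chosen volume and the number of column volumes delivered during ligand circulation. The molar masses are standard values rather than values from the text, the buffer labels and function names are hypothetical, and the 100 mL batch volume in the usage line is an assumed example. Acetate and glycine-HCl are omitted because the text does not specify which salt form was weighed.

```python
# Molar masses in g/mol (standard values, not taken from the text).
MOLAR_MASS = {
    "NaHCO3": 84.01,
    "NaCl": 58.44,
    "ethanolamine": 61.08,
    "Tris base": 121.14,
}

# Molar concentrations as stated in the example; buffer labels are ours.
BUFFERS = {
    "coupling buffer": {"NaHCO3": 0.2, "NaCl": 0.5},
    "deactivation buffer": {"ethanolamine": 0.5, "NaCl": 0.5},
    "Tris buffer": {"Tris base": 0.075},
}


def grams_needed(buffer_name, volume_ml):
    """Mass of each solute (g) for the given volume, before pH adjustment."""
    litres = volume_ml / 1000.0
    return {
        solute: round(molar * MOLAR_MASS[solute] * litres, 3)
        for solute, molar in BUFFERS[buffer_name].items()
    }


def column_volumes_passed(flow_ml_per_min, minutes, column_ml=1.0):
    """Number of column volumes delivered at a constant flow rate."""
    return flow_ml_per_min * minutes / column_ml


if __name__ == "__main__":
    # Assumed example: a 100 mL batch of each buffer.
    for name in BUFFERS:
        print(name, grams_needed(name, 100))
    # Ligand circulation: 0.5 mL/minute for 2 hours over a 1 mL column.
    print("column volumes circulated:", column_volumes_passed(0.5, 120))
```

At 0.5 mL/minute for 2 hours, 60 mL passes through the 1 mL column, so the ligand solution must be recirculated unless at least that volume is prepared.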

Antibodies to gp-41 were raised in goats by inoculation with the gp-41 peptide 
described above. More specifically, goats were inoculated, given a booster injection 
of the gp-41 peptide and then bled. Serum was harvested by centrifugation. 
Approximately 30 mL of goat serum was filtered to 0.45 μm and passed over a TAG 
column at a rate of 0.5 mL/min. The column was washed with 75 mM Tris, pH 8.0 
until absorbance at 280 nm reached a baseline. Three column volumes (3 mL) of 
elution buffer (100 mM glycine, 0.5 M NaCl, pH 2.7) were applied, followed by 75 
mM Tris buffer, pH 8.0, all at a rate of 0.5 mL/min. One milliliter fractions were 
collected. Fractions were collected into 200 μL 1 M Tris, pH 9.0 to neutralize acidic 
fractions as rapidly as possible. A large peak eluted from the column, coincident with 
the application of the elution buffer. Fractions were pooled. Analysis by SDS-PAGE 
showed a high molecular weight species that separated into two fragments under 
reducing conditions, in keeping with the heavy and light chain structure of IgG. 

Pooled antibody fractions were used to charge two 1 mL HiTrap NHS-activated 
columns, attached in series. Coupling was carried out in the same manner as 
that used for charging the TAG column. 

Isolation of Ovalbumin-TAG-Proinsulin from Egg White 

Egg white from quail and chickens treated by intra-oviduct injection of the 
CMV-ovalbumin-TAG-proinsulin construct was pooled. Viscosity was lowered by 
subjecting the allantoid fluid to successively finer pore sizes using negative pressure 
filtration, finishing with a 0.22 μm pore size. Through the process, egg white was 
diluted approximately 1:16. The clarified sample was loaded on the Anti-TAG 
column and eluted in the same manner as described for the purification of the 
anti-TAG antibodies. A peak of absorbance at 280 nm, coincident with the application of 
the elution buffer, indicated that protein had been specifically eluted from the 
Anti-TAG column. Fractions containing the eluted peak were pooled for analysis. 

The pooled fractions from the Anti-TAG affinity column were characterized 
by SDS-PAGE and western blot analysis. SDS-PAGE of the pooled fractions revealed 
a 60 kDa molecular weight band not present in control egg white fluid, consistent 
with the predicted molecular weight of the transgenic protein. Although some 
contaminating bands were observed, the 60 kDa species was greatly enriched 
compared to the other proteins. An aliquot of the pooled fractions was cleaved 
overnight at room temperature with the protease, enterokinase. SDS-PAGE analysis 
of the cleavage product revealed a band not present in the uncut material that 
co-migrated with a commercial human proinsulin positive control. Western blot analysis 
showed specific binding to the 60 kDa species under non-reducing conditions (which 
preserve the hairpin epitope of gp-41 by retaining the disulfide bond). Western 
analysis with an anti-human proinsulin antibody of the low molecular weight species 
that appeared upon cleavage conclusively identified the cleaved fragment as 
human proinsulin. 
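
The agreement between the 60 kDa band and the predicted size of the fusion protein can be checked roughly from segment lengths. The following Python sketch is an illustration only, not part of the original disclosure. It takes the lengths of the ovalbumin, TAG/enterokinase and proinsulin segments listed for the CMV-CHOVg-ent-ProInsulin-synPA constructs in Example 20 (1154 bp, 150 bp and 261 bp), and applies the common rule of thumb of about 110 Da per amino acid residue. The function name is hypothetical. The estimate ignores glycosylation and restriction site remnants, and assumes that essentially all listed base pairs are coding.

```python
AVERAGE_RESIDUE_DA = 110  # rule-of-thumb average mass per amino acid residue


def estimate_kda(segment_lengths_bp):
    """Rough polypeptide mass in kDa from the summed lengths of coding segments."""
    total_bp = sum(segment_lengths_bp)
    residues = total_bp // 3  # whole codons only
    return round(residues * AVERAGE_RESIDUE_DA / 1000, 2)


if __name__ == "__main__":
    # Ovalbumin gene, TAG/enterokinase spacer, human proinsulin (bp), per Example 20.
    fusion = [1154, 150, 261]
    print(f"estimated fusion protein mass: {estimate_kda(fusion)} kDa")
```

Under these assumptions the estimate is about 57 kDa for the unmodified polypeptide, which is in the neighborhood of the 60 kDa band observed by SDS-PAGE.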


EXAMPLE 18 
Construction of a Transposon-based Transgene for the Expression of a Monoclonal 
Antibody 

Production of a monoclonal antibody using transposon-based transgenic 
methodology is accomplished in a variety of ways. 

1) Two vectors are constructed: one that encodes the light chain and a second vector 
that encodes the heavy chain of the monoclonal antibody. These vectors are then 
incorporated into the genome of the target animal by at least one of two methods: a) 
direct transfection of a single animal with both vectors (simultaneously or as separate 
events); or, b) a male and a female of the species carry in their germline one of the 
vectors and then they are mated to produce progeny that inherit a copy of each. 

2) The light and heavy chains are included on a single DNA construct, either separated 
by insulators with expression governed by the same (or different) promoters, or 
using a single promoter governing expression of both transgenes with the inclusion of 
elements that permit separate translation of both transgenes, such as an internal 
ribosome entry site. 

The following example describes the production of a transposon-based DNA 
construct that contains both the coding region for a monoclonal light chain and a 
heavy chain on a single construct. Beginning with the vector pTnMod, the coding 
sequences for the heavy and light chains are added, each preceded by an appropriate 
promoter and signal sequence. Using methods known to one skilled in the art, 
approximately 1 Kb of the proximal elements of the ovalbumin promoter are linked to 
the signal sequence of ovalbumin or some other protein secreted from the target 
tissue. Two copies of the promoter and signal sequence are added to the multiple 
cloning site of pTnMod, leaving space and key restriction sites between them to allow 
the subsequent addition of the coding sequences of the light and heavy chains of the 
monoclonal antibody. Methods known to one skilled in the art allow the coding 
sequences of the light and heavy chains to be inserted in-frame for appropriate 
expression. For example, the coding sequences of the light and heavy chains of a murine 
monoclonal antibody that shows specificity for human seminoprotein have recently 
been disclosed (GenBank Accession numbers AY129006 and AY129304 for the light 
and heavy chains, respectively). The light chain cDNA sequence is provided in SEQ 
ID NO:34, whereas the cDNA of the heavy chain is provided in SEQ ID 
NO:35. 

Thus one skilled in the art can produce both the heavy and light chains of a 
monoclonal antibody in a single cell within a target tissue and species. If the modified 
cell contains normal posttranslational modification capabilities, the two chains 
form their native configuration and disulfide attachments and are substrates for 
glycosylation. Upon secretion, then, the monoclonal antibody is accumulated, for 
example, in the egg white of a chicken egg, if the transgenes are expressed in the 
magnum of the oviduct. 

It should also be noted that, although this example details production of a 
full-length murine monoclonal antibody, the method is quite capable of producing hybrid 
antibodies (e.g. a combination of human and murine sequences; 'humanized' 
monoclonal antibodies), as well as useful antibody fragments, known to one skilled in 
the art, such as Fab, Fc, F(ab) and Fv fragments. This method can be used to produce 
molecules containing the specific areas thought to be the antigen recognition 
sequences of antibodies (complementarity determining regions), linked, modified or 
incorporated into other proteins as desired. 

EXAMPLE 19 

Treatment of rats with a transposon-based vector for tissue-specific insulin gene 
incorporation 

Rats are made diabetic by administering the drug streptozotocin (Zanosar, 
Upjohn, Kalamazoo, MI) at approximately 200 mg/kg. The rats are bred and 
maintained according to standard procedures. A transposon-based vector containing a 
proinsulin gene, an appropriate carrier, and, optionally, a transfection agent, are 
injected into rats' singhepatic (if using G6P) artery with the purpose of stable 
transformation. Incorporation of the insulin gene into the rat genome and levels of 
insulin expression are ascertained by a variety of methods known in the art. Blood 
and tissue samples from live or sacrificed animals are tested. A combination of PCR, 
Southern and Northern blots, in-situ hybridization and related nucleic acid analysis 
methods are used to determine incorporation of the vector-derived proinsulin DNA 
and levels of transcription of the corresponding mRNA in various organs and tissues 
of the rats. A combination of SDS-PAGE gels, Western Blot analysis, 
radioimmunoassay, and ELISA and other methods known to one of ordinary skill in 
the art are used to determine the presence of insulin and the amount produced. 
Additional transfections of the vector are used to increase protein expression if the 
initial amounts of the expressed insulin are not satisfactory, or if the level of 
expression tapers off. The physiological condition of the rats is closely examined 
post-transfection to register positive or any negative effects of the gene therapy. 
Animals are examined over extended periods of time post-transfection in order to 
monitor the stability of gene incorporation and protein expression. 

EXAMPLE 20 


Exemplary Transposon-Based Vectors 

The following example provides a description of various transposon-based 
vectors of the present invention and several constructs for insertion into the 
transposon-based vectors of the present invention. These examples are not meant to 
be limiting in any way. The constructs for insertion into a transposon-based vector 
are provided in a cloning vector labeled pTnMCS. 

pTnMCS (base vector) 

Bp 1 - 130 Remainder of F1 (-) ori of pBluescriptII sk(-) (Stratagene) bp 1-130 
Bp 133 - 1777 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems) bp 229-1873 
Bp 1783 - 2991 Transposase, from Tn10 (GenBank accession # J01829) bp 108-1316 
Bp 2992 - 3344 Non-coding DNA from vector pNK2859 
Bp 3345 - 3387 Lambda DNA from pNK2859 
Bp 3388 - 3457 70 bp of IS10 left from Tn10 
Bp 3464 - 3670 Multiple cloning site from pBluescriptII sk(-), thru the XmaI site 
bp 924-718 
Bp 3671 - 3715 Multiple cloning site from pBluescriptII sk(-), from the XmaI site 
thru the XhoI site. These base pairs are usually lost when cloning into pTnMCS. bp 
717-673 
Bp 3716 - 4153 Multiple cloning site from pBluescriptII sk(-), from the XhoI site 
bp 672-235 
Bp 4159 - 4228 70 bp of IS10 left from Tn10 
Bp 4229 - 4270 Lambda DNA from pNK2859 
Bp 4271 - 5114 Non-coding DNA from pNK2859 
Bp 5115 - 7315 pBluescript sk (-) base vector (Stratagene, Inc.) bp 761-2961 



pTnMCS (CMV-prepro-ent-hGH-CPA) 
Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670 
Bp 3676 - 5320 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5326 - 5496 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 5504 - 5652 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 5653 - 6306 Human growth hormone taken from GenBank accession # V00519, 
bp 1-654 
Bp 6313 - 6720 Conalbumin polyA taken from GenBank accession # Y00407, bp 
10651-11058 
Bp 6722 - 10321 from cloning vector pTnMCS, bp 3716-7315 
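
Because each element in these listings is given both as a position in the construct and as a range in its source sequence, the two can be cross-checked by length. The following Python sketch is an illustration only, not part of the original disclosure. It compares the two lengths for the pTnMCS (CMV-prepro-ent-hGH-CPA) listing above. The names `ROWS`, `span` and `length_mismatches` are hypothetical, and the synthetic spacer is omitted because no source range is given for it.

```python
# (label, (start, end) in the construct, (start, end) in the source sequence),
# copied from the pTnMCS (CMV-prepro-ent-hGH-CPA) listing.
ROWS = [
    ("pTnMCS, 5' portion", (1, 3670), (1, 3670)),
    ("CMV promoter/enhancer (pGWIZ)", (3676, 5320), (230, 1864)),
    ("Capsite/Prepro (X07404)", (5326, 5496), (563, 733)),
    ("Human growth hormone (V00519)", (5653, 6306), (1, 654)),
    ("Conalbumin polyA (Y00407)", (6313, 6720), (10651, 11058)),
    ("pTnMCS, 3' portion", (6722, 10321), (3716, 7315)),
]


def span(start, end):
    """Inclusive length; abs() also handles ranges listed in reverse order."""
    return abs(end - start) + 1


def length_mismatches(rows):
    """Return (label, construct_length, source_length) where the two differ."""
    return [
        (label, span(*construct), span(*source))
        for label, construct, source in rows
        if span(*construct) != span(*source)
    ]


if __name__ == "__main__":
    for label, in_construct, in_source in length_mismatches(ROWS):
        print(f"{label}: {in_construct} bp in construct, {in_source} bp in source")
```

With the coordinates as listed, only the CMV promoter/enhancer entry is flagged: the construct range spans 1645 bp while the cited pGWIZ range spans 1635 bp. The base vector listing cites pGWIZ bp 229-1873 for the same element, which spans 1645 bp. The text does not indicate which range is intended, so the sketch only reports the discrepancy.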

pTnMCS (CMV-CHOVg-ent-ProInsulin-synPA) (SEQ ID NO:41) 
Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670 
Bp 3676 - 5320 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5327 - 6480 Chicken ovalbumin gene taken from GenBank accession # V00383, 
bp 66-1219 
Bp 6487 - 6636 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6637 - 6897 Human Proinsulin taken from GenBank accession # NM000207, bp 
117-377 
Bp 6898 - 6942 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and pGWIZ (Gene Therapy Systems) 
Bp 6943 - 7295 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy 
Systems), bp 1920-2271 

Bp 7296 - 10895 from cloning vector pTnMCS, bp 3716-7315 

pTnMCS (CMV-prepro-ent-ProInsulin-synPA) 
Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670 
Bp 3676 - 5320 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5326 - 5496 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 5504 - 5652 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 5653 - 5913 Human Proinsulin taken from GenBank accession # NM000207, bp 
117-377 
Bp 5914 - 5958 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and pGWIZ (Gene Therapy Systems) 
Bp 5959 - 6310 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy 
Systems), bp 1920-2271 
Bp 6313 - 9912 from cloning vector pTnMCS, bp 3716-7315 

pTnMCS (Chicken OVep+OVg'+ENT+proins+syn polyA) 
Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670 
Bp 3676 - 4350 Chicken Ovalbumin enhancer taken from GenBank accession 
# S82527.1 bp 1-675 
Bp 4357 - 5692 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999 bp 1-1336 
Bp 5699 - 6917 Chicken Ovalbumin gene from GenBank Accession # V00383.1 bp 
2-1220. (This sequence includes the 5'UTR, containing putative cap site, bp 
5699-5762.) 
Bp 6924 - 7073 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 7074 - 7334 Human proinsulin GenBank Accession # NM000207 bp 117-377 
Bp 7335 - 7379 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 
Bp 7380 - 7731 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 

Bp 7733 - 11332 from vector pTnMCS, bp 3716 - 7315 

pTnMCS (Chicken OVep+prepro+ENT+proins+syn polyA) 
Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670 
Bp 3676 - 4350 Chicken Ovalbumin enhancer taken from GenBank accession # 
S82527.1 bp 1-675 
Bp 4357 - 5692 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999 bp 1-1336 
Bp 5699 - 5869 Cecropin cap site and Prepro, Genbank accession # X07404 bp 
563-733 
Bp 5876 - 6025 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6026 - 6286 Human proinsulin GenBank Accession # NM000207 bp 117-377 
Bp 6287 - 6331 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 
Bp 6332 - 6683 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 
Bp 6685 - 10284 from cloning vector pTnMCS, bp 3716 - 7315 

pTnMCS (Quail OVep+OVg'+ENT+proins+syn polyA) 
Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670 
Bp 3676 - 4333 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house 
from quail genomic DNA, roughly equivalent to the far-upstream chicken ovalbumin 
enhancer, GenBank accession # S82527.1, bp 1-675. (There are multiple base pair 
substitutions and deletions in the quail sequence, relative to chicken, so the number of 
bases does not correspond exactly.) 
Bp 4340 - 5705 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house 
from quail genomic DNA, roughly corresponding to chicken ovalbumin promoter, 
GenBank accession # J00895-M24999 bp 1-1336. (There are multiple base pair 
substitutions and deletions between the quail and chicken sequences, so the number of 
bases does not correspond exactly.) 
Bp 5712 - 6910 Quail Ovalbumin gene, EMBL accession # X53964, bp 1-1199. (This 
sequence includes the 5'UTR, containing putative cap site bp 5712-5764.) 
Bp 6917 - 7066 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 7067 - 7327 Human proinsulin GenBank Accession # NM000207 bp 117-377 
Bp 7328 - 7372 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 
Bp 7373 - 7724 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 
Bp 7726 - 11325 from cloning vector pTnMCS, bp 3716 - 7315 



pTnMCS (CHOVep-prepro-ent-hGH-CPA) 
Bp 1 - 3670 from vector pTnMCS, bp 1-3670 
Bp 3676 - 4350 Chicken Ovalbumin enhancer taken from GenBank accession # 
S82527.1, bp 1-675 
Bp 4357 - 5692 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999, bp 1-1336 
Bp 5699 - 5869 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 5877 - 6025 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6026 - 6679 Human growth hormone taken from GenBank accession # V00519, 
bp 1-654 
Bp 6686 - 7093 Conalbumin polyA taken from GenBank accession # Y00407, bp 
10651-11058 
Bp 7095 - 10694 from cloning vector pTnMCS, bp 3716-7315 

pTnMCS (Quail OVep+prepro+ENT+proins+syn polyA) 
Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670 
Bp 3676 - 4333 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house 
from quail genomic DNA, roughly equivalent to the far-upstream chicken ovalbumin 
enhancer, GenBank accession # S82527.1, bp 1-675. (There are multiple base pair 
substitutions and deletions in the quail sequence, relative to chicken, so the number of 
bases does not correspond exactly.) 
Bp 4340 - 5705 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house 
from quail genomic DNA, roughly corresponding to chicken ovalbumin promoter, 
GenBank accession # J00895-M24999 bp 1-1336. (There are multiple base pair 
substitutions and deletions between the quail and chicken sequences, so the number of 
bases does not correspond exactly.) 
Bp 5712 - 5882 Cecropin cap site and Prepro, Genbank accession # X07404 bp 
563-733 
Bp 5889 - 6038 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6039 - 6299 Human proinsulin GenBank Accession # NM000207 bp 117-377 
Bp 6300 - 6344 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 
Bp 6345 - 6696 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 
Bp 6698 - 10297 from cloning vector pTnMCS, bp 3716 - 7315 

pTnMOD 

Bp 1 - 130 Remainder of F1 (-) ori of pBluescriptII sk(-) (Stratagene) bp 1-130 
Bp 133 - 1777 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems) bp 229-1873 
Bp 1783 - 2991 Transposase, modified from Tn10 (GenBank accession # J01829) bp 
108-1316 
Bp 2992 - 2994 Engineered stop codon 
Bp 2996 - 3411 Synthetic polyA from gWIZ (Gene Therapy Systems) bp 1922 - 2337 
Bp 3412 - 3719 Non-coding DNA from vector pNK2859 
Bp 3720 - 3762 Lambda DNA from pNK2859 
Bp 3763 - 3832 70 bp of IS10 left from Tn10 
Bp 3839 - 4045 Multiple cloning site from pBluescriptII sk(-), thru the XmaI site bp 
924-718 
Bp 4046 - 4090 Multiple cloning site from pBluescriptII sk(-), from the XmaI site 
thru the XhoI site. These base pairs are usually lost when cloning into pTnMCS. bp 
717-673 
Bp 4091 - 4528 Multiple cloning site from pBluescriptII sk(-), from the XhoI site bp 
672-235 
Bp 4534 - 4603 70 bp of IS10 left from Tn10 
Bp 4604 - 4645 Lambda DNA from pNK2859 
Bp 4646 - 5489 Non-coding DNA from pNK2859 
Bp 5490 - 7690 pBluescript sk (-) base vector (Stratagene, Inc.) bp 761-2961 

pTnMOD (CHOVep-prepro-ent-hGH-CPA) 
Bp 1 - 4045 from vector pTnMCS, bp 1 - 4045 
Bp 4051 - 4725 Chicken Ovalbumin enhancer taken from GenBank accession # 
S82527.1, bp 1-675 
Bp 4732 - 6067 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999, bp 1-1336 
Bp 6074 - 6245 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 6252 - 6400 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6401 - 7054 Human growth hormone taken from GenBank accession # V00519, 
bp 1-654 
Bp 7061 - 7468 Conalbumin polyA taken from GenBank accession # Y00407, bp 
10651-11058 
Bp 7470 - 11069 from cloning vector pTnMCS, bp 3716-7315 

pTnMOD (CMV-CHOVg-ent-ProInsulin-synPA) (SEQ ID NO:42) 
Bp 1 - 4045 from vector pTnMCS, bp 1 - 4045 
Bp 4051 - 5695 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5702 - 6855 Chicken ovalbumin gene taken from GenBank accession # V00383, 
bp 66-1219 
Bp 6862 - 7011 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 7012 - 7272 Human Proinsulin taken from GenBank accession # NM000207, bp 
117-377 
Bp 7273 - 7317 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and pGWIZ (Gene Therapy Systems) 
Bp 7318 - 7670 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy 
Systems), bp 1920-2271 
Bp 7672 - 11271 from cloning vector pTnMCS, bp 3716-7315 

pTnMOD (CMV-prepro-ent-hGH-CPA) 
Bp 1 - 4045 from vector pTnMCS, bp 1 - 4045 
Bp 4051 - 5695 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5701 - 5871 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 5879 - 6027 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6028 - 6681 Human growth hormone taken from GenBank accession # V00519, 
bp 1-654 
Bp 6688 - 7095 Conalbumin polyA taken from GenBank accession # Y00407, bp 
10651-11058 
Bp 7097 - 10696 from cloning vector pTnMCS, bp 3716-7315 

pTnMOD (CMV-prepro-ent-ProInsulin-synPA) 
Bp 1 - 4045 from vector pTnMCS, bp 1 - 4045 
Bp 4051 - 5695 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems), bp 230-1864 
Bp 5701 - 5871 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733 
Bp 5879 - 6027 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 
Bp 6028 - 6288 Human Proinsulin taken from GenBank accession # NM000207, bp 
117-377 
Bp 6289 - 6333 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and pGWIZ (Gene Therapy Systems) 
Bp 6334 - 6685 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy 
Systems), bp 1920-2271 

Bp 6687 -10286 from cloning vector pTnMCS, bp 3716-7315 
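
The coordinate map for pTnMOD (CMV-prepro-ent-ProInsulin-synPA) can be checked mechanically. The following is a minimal Python sketch, not part of the original disclosure: the `Segment` class and the `gaps` helper are hypothetical names introduced here for illustration, and the coordinates are simply transcribed from the map above. It reports any bases that fall between listed features; the source does not say what those bases are, so treating them as linker or restriction-site bases is only an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One feature of a vector map, using 1-based inclusive coordinates."""
    start: int
    end: int
    label: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates transcribed from the pTnMOD (CMV-prepro-ent-ProInsulin-synPA) map.
SEGMENTS = [
    Segment(1, 4045, "pTnMCS bp 1-4045"),
    Segment(4051, 5695, "CMV promoter/enhancer"),
    Segment(5701, 5871, "cap site/prepro"),
    Segment(5879, 6027, "spacer, gp41 hairpin, enterokinase site"),
    Segment(6028, 6288, "human proinsulin"),
    Segment(6289, 6333, "spacer DNA"),
    Segment(6334, 6685, "synthetic polyA"),
    Segment(6687, 10286, "pTnMCS bp 3716-7315"),
]


def gaps(segments):
    """Return (label_before_gap, unlisted_base_count) for each gap between features."""
    ordered = sorted(segments, key=lambda s: s.start)
    found = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start <= prev.end:
            raise ValueError(f"{prev.label!r} overlaps {cur.label!r}")
        unlisted = cur.start - prev.end - 1
        if unlisted:
            found.append((prev.label, unlisted))
    return found


if __name__ == "__main__":
    for label, count in gaps(SEGMENTS):
        print(f"{count} unlisted base(s) after {label}")
    listed = sum(s.length for s in SEGMENTS)
    unlisted = sum(count for _, count in gaps(SEGMENTS))
    print("listed + unlisted =", listed + unlisted)
```

The same check can be repeated for the other maps by swapping in their coordinates. It also confirms, for example, that the proinsulin feature (bp 6028 - 6288) and its stated source range (bp 117-377) are both 261 bases long.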

pTnMOD (Chicken OVep+OVg'+ENT+proins+syn polyA) (SEQ ID NO:43) 

Bp 1 - 4045 from cloning vector pTnMOD, bp 1 - 4045 

Bp 4051 - 4725 Chicken Ovalbumin enhancer taken from GenBank accession # 
S82527.1 bp 1-675 

Bp 4732 - 6067 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999 bp 1-1336 

Bp 6074 - 7292 Chicken Ovalbumin gene from GenBank Accession # V00383.1 bp 
2-1220. (This sequence includes the 5'UTR, containing putative cap site bp 6074- 
6137.) 

Bp 7299 - 7448 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 

Bp 7449 - 7709 Human proinsulin GenBank Accession # NM000207 bp 117-377 

Bp 7710 - 7754 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 

Bp 7755 - 8106 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 

Bp 8108 - 11707 from cloning vector pTnMCS, bp 3716 - 7315 

pTnMOD (Chicken OVep+prepro+ENT+proins+syn polyA) 

Bp 1 - 4045 from cloning vector pTnMCS, bp 1 - 4045 

Bp 4051 - 4725 Chicken Ovalbumin enhancer taken from GenBank accession # 
S82527.1 bp 1-675 

Bp 4732 - 6067 Chicken Ovalbumin promoter taken from GenBank accession # 
J00895-M24999 bp 1-1336 

Bp 6074 - 6244 Cecropin cap site and Prepro, GenBank accession # X07404 bp 563- 
733 

Bp 6251 - 6400 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 

Bp 6401 - 6661 Human proinsulin GenBank Accession # NM000207 bp 117-377 

Bp 6662 - 6706 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 

Bp 6707 - 7058 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 

Bp 7060 - 10659 from cloning vector pTnMCS, bp 3716-7315 

pTnMOD (Quail OVep+OVg'+ENT+proins+syn polyA) 

Bp 1 - 4045 from cloning vector pTnMCS, bp 1 - 4045 

Bp 4051 - 4708 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house 
from quail genomic DNA, roughly equivalent to the far-upstream chicken ovalbumin 
enhancer, GenBank accession # S82527.1, bp 1-675. (There are multiple base pair 
substitutions and deletions in the quail sequence, relative to chicken, so the number of 
bases does not correspond exactly.) 

Bp 4715 - 6080 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house 
from quail genomic DNA, roughly corresponding to chicken ovalbumin promoter, 
GenBank accession # J00895-M24999 bp 1-1336. (There are multiple base pair 
substitutions and deletions between the quail and chicken sequences, so the number of 
bases does not correspond exactly.) 

Bp 6087 - 7285 Quail Ovalbumin gene, EMBL accession # X53964, bp 1-1199. (This 
sequence includes the 5'UTR, containing putative cap site bp 6087-6139.) 

Bp 7292 - 7441 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 

Bp 7442 - 7702 Human proinsulin GenBank Accession # NM000207 bp 117-377 

Bp 7703 - 7747 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 

Bp 7748 - 8099 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 

Bp 8101 - 11700 from cloning vector pTnMCS, bp 3716 - 7315 



pTnMOD (Quail OVep+prepro+ENT+proins+syn polyA) 

Bp 1 - 4045 from cloning vector pTnMCS, bp 1 - 4045 

Bp 4051 - 4708 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house 
from quail genomic DNA, roughly equivalent to the far-upstream chicken 
ovalbumin enhancer, GenBank accession # S82527.1, bp 1-675. (There are multiple 
base pair substitutions and deletions in the quail sequence, relative to chicken, so the 
number of bases does not correspond exactly.) 

Bp 4715 - 6080 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house 
from quail genomic DNA, roughly corresponding to chicken ovalbumin promoter, 
GenBank accession # J00895-M24999 bp 1-1336. (There are multiple base pair 
substitutions and deletions between the quail and chicken sequences, so the number of 
bases does not correspond exactly.) 

Bp 6087 - 6257 Cecropin cap site and Prepro, GenBank accession # X07404 bp 563- 
733 

Bp 6264 - 6413 Synthetic spacer sequence and hairpin loop of HIV gp41 with an 
added enterokinase cleavage site 

Bp 6414 - 6674 Human proinsulin GenBank Accession # NM000207 bp 117-377 

Bp 6675 - 6719 Spacer DNA, derived as an artifact from the cloning vectors pTOPO 
Blunt II (Invitrogen) and gWIZ (Gene Therapy Systems) 

Bp 6720 - 7071 Synthetic polyA from the cloning vector gWIZ (Gene Therapy 
Systems) bp 1920-2271 

Bp 7073 - 10672 from cloning vector pTnMCS, bp 3716 - 7315 

pTnMod (CMV/Transposase/ChickOVep/prepro/ProteinA/ConpolyA) 

BP 1-130 remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene) bp 1-130. 

BP 133-1777 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy 
Systems) bp 229-1873. 

BP 1780-2987 Transposase, modified from Tn10 (GenBank # J01829). 

BP 2988-2990 Engineered stop codon. 

BP 2991-3343 noncoding DNA from vector pNK2859. 

BP 3344-3386 Lambda DNA from pNK2859. 

BP 3387-3456 70 bp of IS10 left from Tn10. 

BP 3457-3674 multiple cloning site from pBluescriptII sk(-) bp 924-707. 

BP 3675-5691 Chicken Ovalbumin enhancer plus promoter from a Topo Clone 10 
maxi 040303 (5' XmaI, 3' BamHI) 

BP 5698-5865 prepro with Cap site amplified from cecropin of pMON200 
GenBank # X07404 (5'BamHI, 3'KpnI) 

BP 5872-7338 Protein A gene from GenBank # J01786, mature peptide bp 292-1755 
(5'KpnI, 3'SacII) 

BP 7345-7752 ConPolyA from Chicken conalbumin polyA from GenBank # Y00407 
bp 10651-11058. (5'SacII, 3'XhoI) 

BP 7753-8195 multiple cloning site from pBluescriptII sk(-) bp 677-235. 

BP 8196-8265 70 bp of IS10 left from Tn10. 

BP 8266-8307 Lambda DNA from pNK2859 

BP 8308-9151 noncoding DNA from pNK2859 

BP 9152-11352 pBluescriptII sk(-) base vector (Stratagene, Inc.) bp 761-2961 


All patents, publications and abstracts cited above are incorporated herein by 
reference in their entirety. It should be understood that the foregoing relates only to 
preferred embodiments of the present invention and that numerous modifications or 
alterations may be made therein without departing from the spirit and the scope of the 
present invention as defined in the following claims. 
CLAIMS 

We Claim: 

1. A vector comprising: 

a) a transposase gene operably linked to a first promoter; and 

b) one or more genes of interest operably-linked to one or more 
additional promoters; 

c) wherein the one or more genes of interest and their operably-linked 
promoters are flanked by transposase insertion sequences recognized by 
the transposase; and 

d) wherein the first promoter comprises a modified Kozak 
sequence comprising ACCATG. 

2. The vector of claim 1, wherein one to twenty codons at a beginning of 
the transposase gene are modified by changing a nucleotide at a third base 
position of the codon to an adenine or thymine without modifying the amino 
acid encoded by the codon. 

3. The vector of claim 2, wherein the transposase gene is modified in its 
first ten codons. 

4. The vector of claim 1, wherein the transposase is a Tn10 transposase. 

5. The vector of claim 1, wherein the first promoter is a constitutive 
promoter. 

6. The vector of claim 1, wherein the first promoter is an inducible 
promoter. 

7. The vector of claim 6, wherein the inducible promoter is an ovalbumin 
promoter or a vitellogenin promoter. 

8. The vector of claim 1, wherein one gene of interest is operably-linked 
to a second promoter. 

9. The vector of claim 8, wherein the second promoter is a constitutive 
promoter. 

10. The vector of claim 8, wherein the second promoter is an inducible 
promoter. 

11. The vector of claim 10, wherein the inducible promoter is an 
ovalbumin promoter or a vitellogenin promoter. 

12. The vector of claim 1, further comprising a polyA sequence 
operably-linked to the transposase gene. 

13. The vector of claim 12, wherein the polyA sequence is a conalbumin 
polyA sequence. 

14. The vector of claim 1 or claim 12, further comprising two stop codons 
operably-linked to the transposase gene. 

15. The vector of claim 1, wherein a first gene of interest is 
operably-linked to a second promoter and a second gene of interest is 
operably-linked to a third promoter. 

16. The vector of claim 1, wherein a first and a second gene of interest are 
operably-linked to a second promoter. 

17. The vector of claim 1, further comprising an enhancer operably-linked 
to the one or more genes of interest. 

18. The vector of claim 17, wherein the enhancer comprises at least a 
portion of an ovalbumin enhancer. 

19. The vector of claim 1, further comprising an egg directing sequence 
operably-linked to the one or more genes of interest. 

20. The vector of claim 19, wherein the egg directing sequence is an 
ovalbumin signal sequence or an ovomucoid signal sequence. 

21. The vector of claim 19, wherein the egg directing sequence is a 
vitellogenin targeting sequence. 
22. A method for producing a transgenic animal comprising administering 
to the animal a vector comprising: 

a) a modified transposase gene operably linked to a first promoter; 
and 

b) one or more genes of interest operably-linked to one or more 
additional promoters; 

c) wherein the one or more genes of interest and their operably-linked 
promoters are flanked by transposase insertion sequences recognized by 
the transposase; and 

d) wherein the first promoter comprises a modified Kozak 
sequence comprising ACCATG. 

23. The method of claim 22, wherein one to twenty codons at a beginning 
of the transposase gene are modified by changing a nucleotide at a third base 
position of the codon to an adenine or thymine without modifying the amino 
acid encoded by the codon. 

24. The method of claim 23, wherein the transposase gene is modified in 
its first ten codons. 

25. The method of claim 22, wherein the vector is administered via an 
intratesticular, intraarterial, intraoviductal or intraembryonic route. 

26. The method of claim 22, wherein the transposase is a Tn10 
transposase. 

27. The method of claim 22, wherein the first promoter is a constitutive 
promoter. 

28. The method of claim 22, wherein the first promoter is an inducible 
promoter. 

29. The method of claim 28, wherein the inducible promoter is selected 
from the group consisting of an ovalbumin promoter, an ovomucoid promoter 
and a vitellogenin promoter. 

30. The method of claim 22, wherein one gene of interest is operably 
linked to a second promoter. 

31. The method of claim 30, wherein the second promoter is an inducible 
promoter. 

32. The method of claim 31, wherein the inducible promoter is an 
ovalbumin promoter or a vitellogenin promoter. 

33. The method of claim 22, wherein the vector further comprises a polyA 
sequence operably-linked to the transposase gene. 

34. The method of claim 33, wherein the polyA sequence is a conalbumin 
polyA sequence. 

35. The method of claim 22 or claim 33, wherein the vector further 
comprises two stop codons operably-linked to the transposase gene. 

36. The method of claim 22, wherein a first gene of interest is 
operably-linked to a second promoter and a second gene of interest is 
operably-linked to a third promoter. 

37. The method of claim 22, wherein a first and second gene of interest are 
operably linked to a second promoter. 

38. The method of claim 22, further comprising an enhancer 
operably-linked to the one or more genes of interest. 

39. The method of claim 38, wherein the enhancer comprises at least a 
portion of an ovalbumin enhancer. 

40. The method of claim 22, wherein the animal is an avian animal. 

41. The method of claim 40, wherein the avian animal is a chicken or a 
quail. 

42. An egg produced by the avian animal of claim 40, wherein the egg 
contains one or more desired proteins encoded by the one or more genes of 
interest. 






WO 2004/003157 PCT/US2003/020389 

43. A transgenic sperm produced by the animal of claim 22. 

44. A method of producing a desired protein comprising: 

a) administering to an animal a vector comprising a modified 
transposase gene operably linked to a first promoter, and a gene of interest 
encoding the desired protein operably-linked to a second promoter; and 

b) isolating the desired protein produced in the animal; wherein 

c) the gene of interest and its operably-linked promoter are 
flanked by transposase insertion sequences recognized by the transposase; and 

d) the first promoter comprises a modified Kozak sequence 
comprising ACCATG. 

45. The method of claim 44, wherein one to twenty codons at a beginning 
of the transposase gene are modified by changing a nucleotide at a third base 
position of the codon to an adenine or thymine without modifying the amino 
acid encoded by the codon. 

46. The method of claim 44, wherein the animal is an egg-laying animal 
and the desired protein is isolated from an egg white. 

47. The method of claim 44, wherein the vector further comprises a TAG 
sequence and wherein the desired protein is purified using the TAG sequence. 

48. The method of claim 47, wherein the TAG sequence comprises a 
polynucleotide sequence encoding an antigenic portion of a gp41 protein, an 
enterokinase cleavage site and a spacer polynucleotide sequence. 

49. The method of claim 47, wherein the TAG sequence comprises a 
polynucleotide sequence shown in SEQ ID NO:22. 

50. The method of claim 44, wherein the desired protein is a lytic protein. 

51. The method of claim 44, wherein the vector further comprises a second 
gene of interest operably-linked to a third promoter and wherein the genes of 
interest encode antibody polypeptides. 
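
The codon modification recited in claims 2, 23 and 45 (changing the third base of early codons to adenine or thymine without changing the encoded amino acid) and the ACCATG motif of the modified Kozak sequence can be illustrated with a short Python sketch. This is an illustration only and is not taken from the specification: the function names `at_wobble`, `translate` and `has_modified_kozak` are hypothetical, the standard genetic code is assumed, and the choice to try adenine before thymine is an arbitrary assumption of this sketch.

```python
# Standard genetic code; codons are enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate complete codons of an upper-case coding sequence."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def at_wobble(cds, n_codons=10):
    """Replace a G or C third base with A or T in the first n_codons,
    but only where the encoded amino acid stays the same."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    out = []
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if i // 3 < n_codons and codon[2] in "GC":
            for alt in "AT":  # assumption: prefer A, fall back to T
                candidate = codon[:2] + alt
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    # Any trailing partial codon is passed through unchanged.
    return "".join(out) + cds[usable:]


def has_modified_kozak(region):
    """True if the region contains the ACCATG motif named in the claims."""
    return "ACCATG" in region.upper()


if __name__ == "__main__":
    example = "ATGCTGGCCAAGTGGGAC"  # hypothetical coding sequence
    modified = at_wobble(example)
    print(modified, translate(modified) == translate(example))
```

Codons such as ATG (methionine) and TGG (tryptophan) have no synonymous codon ending in A or T, so the sketch leaves them unchanged, which is consistent with the claim language "without modifying the amino acid encoded by the codon".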
SEQ ID NO:1 (pTnMod) 

5 CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 

10 TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 3 00 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350 
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450 
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500 

15 TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 

20 CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 8 00 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 
GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 

25 CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250 

30 AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 14 00 
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450 
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500 

35 GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550 
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600 
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 
GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 

40 CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 

45 GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2 050 
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 
TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200 
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250 

50 TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350 
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400 
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 

55 AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 
GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650 
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700 
TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750 

60 ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800 
GAAACAAGGT TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2 850 
ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2900 
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2 950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3 000 
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150 
TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200 
CTATTCTGGG GGGTGGGGTG GGGCAGCACA GCAAGGGGGA GGATTGGGAA 3250 
GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3300 

10 CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400 
TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3 550 

15 TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3 650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3 700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 
GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3 800 

20 TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3 850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3 900 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3 950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4 000 
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGC 4 050 

25 TGCAGGAATT CGATATCAAG CTTATCGATA CCGCTGACCT CGAGGGGGGG 4100 
CCCGGTACCC AATTCGCCCT ATAGTGAGTC GTATTACGCG CGCTCACTGG 4150 
CCGTCGTTTT ACAACGTCGT GACTGGGAAA ACCCTGGCGT TACCCAACTT 4200 
AATCGCCTTG CAGCACATCC CCCTTTCGCC AGCTGGCGTA ATAGCGAAGA 4250 
GGCCCGCACC GATCGCCCTT CCCAACAGTT GCGCAGCCTG AATGGCGAAT 4300 

30 GGAAATTGTA AGCGTTAATA TTTTGTTAAA ATTCGCGTTA AATTTTTGTT 4350 
AAATCAGCTC ATTTTTTAAC CAATAGGCCG AAATCGGCAA AATCCCTTAT 4400 
AAATCAAAAG AATAGACCGA GATAGGGTTG AGTGTTGTTC CAGTTTGGAA 4450 
CAAGAGTCCA CTATTAAAGA ACGTGGACTC CAACGTCAAA GGGCGAAAAA 4500 
CCGTCTATCA GGGCGATGGC CCACTACTCC GGGATCATAT GACAAGATGT 4550 

35 GTATCCACCT TAACTTAATG ATTTTTACCA AAATCATTAG GGGATTCATC 4600 
AGTGCTCAGG GTCAACGAGA ATTAACATTC CGTCAGGAAA GCTTATGATG 4650 
ATGATGTGCT TAAAAACTTA CTCAATGGCT GGTTATGCAT ATCGCAATAC 4700 
ATGCGAAAAA CCTAAAAGAG CTTGCCGATA AAAAAGGCCA ATTTATTGCT 4 750 
ATTTACCGCG GCTTTTTATT GAGCTTGAAA GATAAATAAA ATAGATAGGT 4800 

40 TTTATTTGAA GCTAAATCTT CTTTATCGTA AAAAATGCCC TCTTGGGTTA 4 850 
TCAAGAGGGT CATTATATTT CGCGGAATAA CATCATTTGG TGACGAAATA 4 900 
ACTAAGCACT TGTCTCCTGT TTACTCCCCT GAGCTTGAGG GGTTAACATG 4 950 
AAGGTCATCG ATAGCAGGAT AATAATACAG TAAAACGCTA AACCAATAAT 5000 
CCAAATCCAG CCATCCCAAA TTGGTAGTGA ATGATTATAA ATAACAGCAA 5050 

45 ACAGTAATGG GCCAATAACA CCGGTTGCAT TGGTAAGGCT CACCAATAAT 5100 
CCCTGTAAAG CACCTTGCTG ATGACTCTTT GTTTGGATAG ACATCACTCC 5150 
CTGTAATGCA GGTAAAGCGA TCCCACCACC AGCCAATAAA ATTAAAACAG 5200 
GGAAAACTAA CCAACCTTCA GATATAAACG CTAAAAAGGC AAATGCACTA 5250 
CTATCTGCAA TAAATCCGAG CAGTACTGCC GTTTTTTCGC CCATTTAGTG 5300 

50 GCTATTCTTC CTGCCACAAA GGCTTGGAAT ACTGAGTGTA AAAGACCAAG 5350 
ACCCGTAATG AAAAGCCAAC CATCATGCTA TTCATCATCA CGATTTCTGT 5400 
AATAGCACCA CACCGTGCTG GATTGGCTAT CAATGCGCTG AAATAATAAT 5450 
CAACAAATGG CATCGTTAAA TAAGTGATGT ATACCGATCA GCTTTTGTTC 5500 
CCTTTAGTGA GGGTTAATTG CGCGCTTGGC GTAATCATGG TCATAGCTGT 5550 

55 TTCCTGTGTG AAATTGTTAT CCGCTCACAA TTCCACACAA CATACGAGCC 5600 
GGAAGCATAA AGTGTAAAGC CTGGGGTGCC TAATGAGTGA GCTAACTCAC 5650 
ATTAATTGCG TTGCGCTCAC TGCCCGCTTT CCAGTCGGGA AACCTGTCGT 5700 
GCCAGCTGCA TTAATGAATC GGCCAACGCG CGGGGAGAGG CGGTTTGCGT 5750 
ATTGGGCGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC GCTCGGTCGT 5800 

60 TCGGCTGCGG CGAGCGGTAT CAGCTCACTC AAAGGCGGTA ATACGGTTAT 5850 
CCACAGAATC AGGGGATAAC GCAGGAAAGA ACATGTGAGC AAAAGGCCAG 5900 
CAAAAGGGCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG 5950 
GCTCCGCCCC CCTGACGAGC ATCACAAAAA TCGACGCTCA AGTCAGAGGT 6 000 
GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC 6 050 
TCCCTCGTGC GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC 6100 
CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCATAGC TCACGCTGTA 6150 
GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC 6200 
GAACCCCCCG TTCAGCCCGA CCGCTGCGCC TTATCCGGTA ACTATCGTCT 6250 
TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG 6300 
GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG 6350 

10 AAGTGGTGGC CTAACTACGG CTACACTAGA AGGACAGTAT TTGGTATCTG 6400 
CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT 6450 
CCGGCAAACA AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG 6500 
CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT TGATCTTTTC 6550 
TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG 6600 

15 TCATGAGATT ATCAAAAAGG ATCTTCACCT AGATCCTTTT AAATTAAAAA 6 650 
TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG 6 700 
TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC 6750 
GTTCATCCAT AGTTGCCTGA CTCCCCGTCG TGTAGATAAC TACGATACGG 6 800 
GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG 6 850 

20 CTCACCGGCT CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG 6 90 0 
AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA GTCTATTAAT 6 950 
TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA 7 000 
CGTTGTTGCC ATTGCTACAG GCATCGTGGT GTCACGCTCG TCGTTTGGTA 7 05 0 
TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC 7100 

25 CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT 7150 
CAGAAGTAAG TTGGCCGCAG TGTTATCACT CATGGTTATG GCAGCACTGC 7200 
ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT 7250 
GAGTACTCAA CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG 7300 
CTCTTGCCCG GCGTCAATAC GGGATAATAC CGCGCCACAT AGCAGAACTT 7350 

30 TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG 7400 
ATCTTACCGC TGTTGAGATC CAGTTCGATG TAACCCACTC GTGCACCCAA 7450 
CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 7500 
CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT 7550 
TGAATACTCA TACTCTTCCT TTTTCAATAT TATTGAAGCA TTTATCAGGG 7600 

35 TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC 7650 
AAATAGGGGT TCCGCGCACA TTTCCCCGAA AAGTGCCAC 7689 
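
Because the listing is printed as ten-base groups with a running position count at the end of each row, and page line numbers sometimes precede a row, it can help to parse rows programmatically and flag groups that contain non-base characters. The following is a minimal Python sketch under those assumptions; `parse_rows` is a hypothetical helper, and the second sample row below is deliberately damaged to mimic a character-recognition error.

```python
import re

# A group is accepted only if it consists entirely of the four DNA bases.
BASE_RUN = re.compile(r"^[ACGT]+$")


def parse_rows(text):
    """Return (sequence, suspects) for sequence-listing rows.

    Purely numeric tokens (position counts, page line numbers) are dropped;
    any other token that is not a clean run of A/C/G/T is reported as suspect
    and left out of the sequence.
    """
    groups, suspects = [], []
    for line in text.splitlines():
        for token in line.split():
            if BASE_RUN.match(token):
                groups.append(token)
            elif not token.isdigit():
                suspects.append(token)
    return "".join(groups), suspects


if __name__ == "__main__":
    sample = (
        "5 CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50\n"
        "TGTTTTGACC TCCATAGAAG ACACC6GGAC CGATCCAGCC TCCGCGGCCG 900\n"
    )
    sequence, suspects = parse_rows(sample)
    print(len(sequence), suspects)
```

Comparing the length of the parsed sequence with the final position count of a listing (7689 for SEQ ID NO:1) is one simple way to detect dropped or damaged groups.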



SEQ ID NO:2 (pTnMod (CMV/Red)) 

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 

45 CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 3 00 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350 
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450 

50 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500 
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700 

55 TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 
GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950 

60 CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400 
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450 
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500 

GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550 
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600 
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 
GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 

CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 

GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050 
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 
TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200 
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250 

TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350 
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400 
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 

AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 
GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650 
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700 
TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750 

ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800 
GAAACAAGGT TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2850 
ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2900 
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000 

GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 
CCGTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150 
TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200 
CTATTCTGGG GGGTGGGGTG GGGCAGCACA GCAAGGGGGA GGATTGGGAA 3250 

GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400 
TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 

CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550 
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 

GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800 
TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000 

GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGC 4050 
ATCAGATTGG CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG 4100 
TACATTTATA TTGGCTCATG TCCAACATTA 
ATTGACTAGT TATTAATAGT AATCAATTAC 
CATATATGGA GTTCCGCGTT ACATAACTTA 
TGACCGCCCA ACGACCCCCG CCCATTGACG 
5 CATAGTAACG CCAATAGGGA CTTTCCATTG 
TACGGTAAAC TGCCCACTTG GCAGTACATC 
ACGCCCCCTA TTGACGTCAA TGACGGTAAA 
CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TAGTCATCGC TATTACCATG GTGATGCGGT 

10 CGTGGATAGC GGTTTGACTC ACGGGGATTT 
CGTCAATGGG AGTTTGTTTT GGCACCAAAA 
GTCGTAACAA CTCCGCCCCA TTGACGCAAA 
TGGGAGGTCT ATATAAGCAG AGCTCGTTTA 
GAGACGCCAT CCACGCTGTT TTGACCTCCA 

15 CCAGCCTCCG CGGCCGGGAA CGGTGCATTG 
AAGAGTGACG TAAGTACCGC CTATAGACTC 
TCTTATGCAT GCTATACTGT TTTTGGCTTG 
CCTTATGCTA TAGGTGATGG TATAGCTTAG 
ACCATTATTG ACCACTCCCC TATTGGTGAC 

20 CATAACATGG CTCTTTGCCA CAACTATCTC 
TCTGTCCTTC AGAGACTGAC ACGGACTCTG 
CCATTTATTA TTTACAAATT CACATATACA 
CGCAGTTTTT ATTAAACATA GCGTGGGATC 
GTGTTCCGGA CATGGGCTCT TCTCCGGTAG 

25 AGCCCTGGTC CCATGCCTCC AGCGGCTCAT 
CTCCTAACAG TGGAGGCCAG ACTTAGGCAC 
CAGTGTGCCG CACAAGGCCG TGGCGGTAGG 
GTGGAGATTG GGCTCGCACG GCTGACGCAG 
GCAGAAGAAG ATGCAGGCAG CTGAGTTGTT 

30 GGTAACTCCC GTTGCGGTGC TGTTAACGGT 
CAGTACTCGT TGCTGCCGCG CGCGCCACCA 
AACAGACTGT TCCTTTCCAT GGGTCTTTTC 
AGGGATCCAC CGGTCGCCAC CATGGTGCGC 
GGAGTTCATG CGCTTCAAGG TGCGCATGGA 

35 AGTTCGAGAT CGA666CGAG GGCGAGGGCC 
ACCGTGAAGC TGAAGGTGAC CAAGGGCGGC 
CATCCTGTCC CCCCAGTTCC AGTACGGCTC 
CCGCCGACAT CCCCGACTAC AAGAAGCTGT 
TGGGAGCGCG TGATGAACTT CGAGGACGGC 

40 GGACTCCTCC CTGCAGGACG GCTGCTTCAT 
GCGTGAACTT CCCCTCCGAC GGCCCCGTAA 
TGGGAGGCCT CCACCGAGCG CCTGTACCCC 
CGAGATCCAC AAGGCCCTGA AGCTGAAGGA 
AGTTCAAGTC CATCTACATG GCCAAGAAGC 

45 TACTACGTGG ACTCCAAGCT GGACATCACC 
CATCGTGGAG CAGTACGAGC GCACCGAGGG 
AGCGGCCGCG ACTCTAGATC ATAATCAGCC 
TTACTTGCTT TAAAAAACCT CCCACACCTC 
AATGAATGCA ATTGTTGTTG TTAACTTGTT 

50 ACAAATAAAG CAATAGCATC ACAAATTTCA 
CTGCATTCTA GTTGTGGCCC GGGCTGCAGG 
GATACCGCTG ACCTCGAGGG GGGGCCCGGT 
AGTCGTATTA CGCGCGCTCA CTGGCCGTCG 
GAAAACCCTG GCGTTACCCA ACTTAATCGC 

55 CGCCAGCTGG CGTAATAGCG AAGAGGCCCG 
AGTTGCGCAG CCTGAATG6C GAATGGAAAT 
TAAAATTCGC GTTAAATTTT TGTTAAATCA 
GCCGAAATCG GCAAAATCCC TTATAAATCA 
GTTGAGTGTT GTTCCAGTTT GGAACAAGAG 

60 ACTCCAACGT CAAAGGGCGA AAAACCGTCT 
CTCCGGGATC ATATGACAAG ATGTGTATCC 
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CCGCCATGTT 


GACATTGATT 


4150 


GGGGTCATTA 


GTTCATAGCC 


4200 


CGGTAAATGG 


CCCGCCTGGC 


4250 


TCAATAATGA 


CGTATGTTCC 


4300 


ACGTCAATGG 


GTGGAGTATT 


4350 


AAGTGTATCA 


TATGCCAAGT 


4400 


TGGCCCGCCT 


GGCATTATGC 


4450 


TTGGCAGTAC 


ATCTACGTAT 


4500 


TTTGGCAGTA 


CATCAATGGG 


4550 


CCAAGTCTCC 


ACCCCATTGA 


4600 


TCAACGGGAC 


TTTCCAAAAT 


4650 


TGGGCGGTAG 


GCGTGTAC6G 


4700 


GTGAACCGTC 


AGATCGCCTG 


4750 


TAGAAGACAC 


CGGGACCGAT 


4800 


GAACGCGGAT 


TCCCCGTGCC 


4850 


TATAGGCACA 


CCCCTTTGGC 


4900 


GGGCCTATAC 


ACCCCCGCTT 


4950 


CCTATAGGTG 


TGGGTTATTG 


5000 


GATACTTTCC 


ATTACTAATC 


5050 


TATTGGCTAT 


ATGCCAATAC 


5100 


TATTTTTACA 


GGATGGGGTC 


5150 


ACAACGCCGT 


CCCCCGTGCC 


5200 


TCCACGCGAA 


TCTCGGGTAC 


5250 




TCCACATCCG 


5300 


GGTCGCTCGG 


CAGCTCCTTG 


53S0 


AG C AC AATGC 


CCACCACCAC 

W XaxCftS^ W*»VlR* V#»«»V^ 


54 00 


GTATGTGTCT 


GAAAATGAGC 


5450 


ATGGAAGACT 


TAAGGCAGCG 


5500 


GTATTCTGAT 


AAGAGTCAGA 


5550 


GGAGGGCAGT 


GTAGTCTGAG 


5600 


GACATAATAG 


CTGACAGACT 


5650 


TGCAGTCACC 


GTCTCGCGAC 


5700 


TCCTCCAAGA 


ACGTCATCAA 


5750 


GGGCACCGTO 


AACGGCCACG 


5800 


GCCCCTACGA 


GGGCCACAAC 


5850 


CCCCTGCCCT 


TCGCCTGGGA 


5900 


CAAGGTGTAC 


GTGAAGCACC 


5950 


CCTTCCCCGA 


GGGCTTCAAG 


6000 


GGCGTGGTGA 


CCGTGACCCA 


6050 


CTACAAGGT6 


AAGTTCATCG 


6100 


TGCAGAAGAA 


GACCATGGGC 


6150 


CGCGACGGCG 


TGCTGAAGGG 


6200 


CGGCGGCCAC 


TACCTGGTGG 


6250 


CCGTGCAGCT 


GCCCGGCTAC 


6300 


TCCCACAACG 


AGGACTACAC 


6350 


CCGCCACCAC 


CTGTTC C TGT 


6400 

w ^ W \f 


ATACCACATT 


TGTAGAGGTT 


6450 


CCCCTGAACC 


TGAAACATAA 


6500 


TATTGCAGCT 


TATAATGGTT 


6550 


CAAATAAAGC 


ATTTTTTTCA 


6600 

W W W w 


AATTCGATAT 


CAAGCTTATC 


6650 


ACCCAATTCG 


CCCTATAGTG 


6700 


TTTTACAACG 


TCGTGACTGG 


6750 


CTTGCAGCAC 


ATCCCCCTTT 


6800 


CACCGATCGC 


CCTTCCCAAC 


6850 


TGTAAGCGTT 


AATATTTTGT 


6900 


GCTCATTTTT 


TAACCAATAG 


6950 


AAAGAATAGA 


CCGAGATAGG 


7000 


TCCACTATTA 


AAGAACGTGG 


7050 


ATCAGGGCGA 


TGGCCCACTA 


7100 


ACCTTAACTT 


AATGATTTTT 


7150 
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. ACCAAAATCA . TTAGGGGATT CATCAGTGCT CAGGGTCAAC GAGAATTAAC 72 0 0 
ATTCCGTCAG GAAAGCTTAT GATGATGATG TGCTTAAAAA CTTACTCAAT 72 50 
GGCTGGTTAT GCATATCGCA ATACATGCGA AAAACCTAAA AGAGCTTGCC 73 00 
GATAAAAAAG GCCAATTTAT TGCTATTTAC CGCGGCTTTT TATTGAGCTT 73 50 
5 GAAAGATAAA TAAAATAGAT AGGTTTTATT TGAAGCTAAA TCTTCTTTAT 74 00 
CGTAAAAAAT GCCCTCTTGG GTTATCAAGA GGGTCATTAT ATTTCGCGGA 7450
ATAACATCAT TTGGTGACGA AATAACTAAG CACTTGTCTC CTGTTTACTC 7500 
CCCTGAGCTT GAGGGGTTAA CATGAAGGTC ATCGATAGCA GGATAATAAT 7550 
ACAGTAAAAC GCTAAACCAA TAATCCAAAT CCAGCCATCC CAAATTGGTA 7600 

10 GTGAATGATT ATAAATAACA GCAAACAGTA ATGGGCCAAT AACACCGGTT 7650 
GCATTGGTAA GGCTCACCAA TAATCCCTGT AAAGCACCTT GCTGATGACT 7700 
CTTTGTTTGG ATAGACATCA CTCCCTGTAA TGCAGGTAAA GCGATCCCAC 7750 
CACCAGCCAA TAAAATTAAA ACAGGGAAAA CTAACCAACC TTCAGATATA 7800 
AACGCTAAAA AGGCAAATGC ACTACTATCT GCAATAAATC CGAGCAGTAC 7850 

15 TGCCGTTTTT TCGCCCATTT AGTGGCTATT CTTCCTGCCA CAAAGGCTTG 7900 
GAATACTGAG TGTAAAAGAC CAAGACCCGT AATGAAAAGC CAACCATCAT 7950 
GCTATTCATC ATCACGATTT CTGTAATAGC ACCACACCGT GCTGGATTGG 8000 
CTATCAATGC GCTGAAATAA TAATCAACAA ATGGCATCGT TAAATAAGTG 8 050 
ATGTATACCG ATCAGCTTTT GTTCCCTTTA GTGAGGGTTA ATTGCGCGCT 8100 

20 TGGCGTAATC ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC 8150 
ACAATTCCAC ACAACATACG AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG 82 00 
TGCCTAATGA GTGAGCTAAC TCACATTAAT TGCGTTGCGC TCACTGCCCG 8250 
CTTTCCAGTC GGGAAACCTG TCGTGCCAGC TGCATTAATG AATCGGCCAA 83 00 
CGCGCGGGGA GAGGCGGTTT GCGTATTGGG CGCTCTTCCG CTTCCTCGCT 83 50 

25 CACTGACTCG CTGCGCTCGG TCGTTCGGCT GCGGCGAGCG GTATCAGCTC 84 00 
ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA TAACGCAGGA 84 50 
AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC 8500 
CGCGTTGCTG GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA 8550 
AAAATCGACG CTCAAGTCAG AGGTGGCGAA ACCCGACAGG ACTATAAAGA 8600 

30 TACCAGGCGT TTCCCCCTGG AAGCTCCCTC GTGCGCTCTC CTGTTCCGAC 8650 
CCTGCCGCTT ACCGGATACC TGTCCGCCTT TCTCCCTTCG GGAAGCGTGG 8700 
CGCTTTCTCA TAGCTCACGC TGTAGGTATC TCAGTTCGGT GTAGGTCGTT 8750 
CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG 8800 
CGCCTTATCC GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT 8850 

35 TATCGCCACT GGCAGCAGCC ACTGGTAACA GGATTAGCAG AGCGAGGTAT 8900 
GTAGGCGGTG CTACAGAGTT CTTGAAGTGG TGGCCTAACT ACGGCTACAC 8950 
TAGAAGGACA GTATTTGGTA TCTGCGCTCT GCTGAAGCCA GTTACCTTCG 9000
GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC CGCTGGTAGC 9050 
GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC 9100 

40 TCAAGAAGAT CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG 9150 
AAAACTCACG TTAAGGGATT TTGGTCATGA GATTATCAAA AAGGATCTTC 9200 
ACCTAGATCC TTTTAAATTA AAAATGAAGT TTTAAATCAA TCTAAAGTAT 9250 
ATATGAGTAA ACTTGGTCTG ACAGTTACCA ATGCTTAATC AGTGAGGCAC 9300 
CTATCTCAGC GATCTGTCTA TTTCGTTCAT CCATAGTTGC CTGACTCCCC 9350 

45 GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC 9400 
TGCAATGATA CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA 9450 
TAAACCAGCC AGCCGGAAGG GCCGAGCGCA GAAGTGGTCC TGCAACTTTA 9500 
TCCGCCTCCA TCCAGTCTAT TAATTGTTGC CGGGAAGCTA GAGTAAGTAG 9550 
TTCGCCAGTT AATAGTTTGC GCAACGTTGT TGCCATTGCT ACAGGCATCG 9600 

50 TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA 9650 
CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG 9700 
CTCCTTCGGT CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT 9750 
CACTCATGGT TATGGCAGCA CTGCATAATT CTCTTACTGT CATGCCATCC 9800 
GTAAGATGCT TTTCTGTGAC TGGTGAGTAC TCAACCAAGT CATTCTGAGA 9850 

55 ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA ATACGGGATA 9900 
ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT 9950
TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC 10000 
GATGTAACCC ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA 10050 
CCAGCGTTTC TGGGTGAGCA AAAACAGGAA GGCAAAATGC CGCAAAAAAG 10100 

60 GGAATAAGGG CGACACGGAA ATGTTGAATA CTCATACTCT TCCTTTTTCA 10150 
ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC GGATACATAT 10200 




WO 2004/003157



PCT/US2003/020389 



TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC 10250
CGAAAAGTGC CAC 10263 
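The listings print the sequence in groups of ten bases with a running position number at the end of each row. The following is a minimal sketch, not part of the original disclosure, of how a row can be checked against its position label. The function name `parse_listing` and its `start` parameter are hypothetical, and the sketch assumes that any digits after the last base group form the position label (so OCR-split labels such as "74 00" are rejoined) and that a leading margin line number is ignored.

```python
def parse_listing(lines, start=0):
    """Join the base groups of listing rows and check trailing position labels.

    `start` is the number of bases that precede the first row (an assumption
    supplied by the caller, e.g. 10200 for a row labelled 10250).
    """
    seq = []
    pos = start
    for line in lines:
        tokens = line.split()
        groups = [t.upper() for t in tokens if t.isalpha()]
        if not groups:
            continue  # blank line or a line holding only numbers
        # Everything after the last base group is treated as the position label.
        last = max(i for i, t in enumerate(tokens) if t.isalpha())
        label = "".join(tokens[last + 1:])
        pos += sum(len(g) for g in groups)
        if label.isdigit() and int(label) != pos:
            raise ValueError(f"row ends at base {pos} but is labelled {label}")
        seq.extend(groups)
    return "".join(seq)


# Final two rows of the sequence that ends at base 10263.
tail = parse_listing(
    [
        "TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC 10250",
        "CGAAAAGTGC CAC 10263",
    ],
    start=10200,
)
print(len(tail))  # 63
```

A row whose label disagrees with the running count raises `ValueError`, which is a quick way to locate dropped or duplicated groups in a scanned listing.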



SEQ ID NO: 3 (pTnMod (Oval/Red) Chicken)

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 

10 GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350 
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 4 00 

15 GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 4 50 
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 5 00 
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 6 00 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 

20 ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 7 00 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 8 00 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 8 50 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 

25 GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 95 0 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 

30 GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400 

35 CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450 
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 

40 GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 
CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 

45 CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 
GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050 
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 

50 TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200 
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250 
TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350 
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 24 00 

55 CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 
AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 
GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650 

60 GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700 
TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750
ATCGCCCTGA TGCTTCAACT AACATGTTGG
GAAACAAGGT TGGGACAAGC ACTTCCAGGC 
ACGTACTCTC AACAGTTCGC TTAGGCATGG 
TACACAATAA CAAGGGAAGA CTTACTCGTG 
5 AAATTTATTC ACACATGGTT ACGCTTTGGG 
GATCACTTCT GGCTAATAAA AGATCAGAGC 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT 
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA 
TTCCTAATAA AATGAGGAAA TTGCATCGCA 

10 CTATTCTGGG GGGTGGGGTG GGGCAGCACA 
GACAATAGCA GGCATGCTGG GGATGCGGTG 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT 
TTGACCCGGT GACCAAAGGT GCCTTTTATC 

15 CAATTACTCA GTGCCTGTTA TAAGCAGCAA 
CATCACAACA AAAACTGATT TAACAAATGG 
TATTTGAACA TTATCTTGAT TATATTATTG 
CCTATCCAAG AAGTGATGCC TATCATTGGT 
TTAGCCTTGA ATACATTACT GGTAAGGTAA 

20 ATCCAAGAGA ACCAACTTAA AGCTTTCCTG 
GACCCTGAGC ACTGATGAAT CCCCTAATGA 
TTAAGGTGGA TACACATCTT GTCATATGAT 
CACTCATTAG GCACCCCAGG CTTTACACTT 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC 

25 CATGATTACG CCAAGCGCGC AATTAACCCT 
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG 
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA 
AACAATAGCT TCTATAACTG AAATATATTT 
TCCCTCGAAC CATGAACACT CCTCCAGCTG 

30 ATCTGCCAGG CCATTAAGTT ATTCATGGAA 
AGTTCATATC ATAAACACAT TTGAAATTGA 
GAGCTATGTT TTGCTGTATC CTCAGAAAAA 
ACACCCATAA AAAGATAGAT TTAAATATTC 
CGTCTGCTCT TCACTCTAGT CTCAGTTGGC 

35 TTATTTCTCC TATTTTGTCA AGAAAATAAT 
TTATGTCCTG CCTAGCATGG CTCAGATGCA 
TCAAATGAAA CAGACTTCTG GTCTGTTACT 
ACTAACTAAT AATTGCTAAT TATGTTTTCC 
TTTCTGTTTT CTTAAAGATC CCATTATCTG 

40 GAACATGAGC AATATTTCCC AGTCTTCTCT 
GATTAGCAGA ACAGGCAGAA AACACATTGT 
TATTTGCTCT CCATTCAATC CAAAATGGAC 
CCCAATCCCA TTAAATGATT TCTATGGCGT 
GGAACCTGTG GGTGGGTCAC AATTCAGGCT 

45 CGGATCTCCA TGGGCTCCAT CGGTGCAGCA 
TGTATTCAAG GAGCTCAAAG TCCACCATGC 
GCCCCATTGC CATCATGTCA GCTCTAGCCA 
GACAGCACCA GGGAATTCGT GCGCTCCTCC 
CATGCGCTTC AAGGTGCGCA TGGAGGGCAC

50 AGATCGAGGG CGAGGGCGAG GGCCGCCCCT 
AAGCTGAAGG TGACCAAGGG CGGCCCCCTG 
GTCCCCCCAG TTCCAGTACG GCTCCAAGGT 
ACATCCCCGA CTACAAGAAG CTGTCCTTCC 
CGCGTGATGA ACTTCGAGGA CGGCGGCGTG 

55 CTCCCTGCAG GACGGCTGCT TCATCTACAA 
ACTTCCCCTC CGACGGCCCC GTAATGCAGA 
GCCTCCACCG AGCGCCTGTA CCCCCGCGAC 
CCACAAGGCC CTGAAGCTGA AGGACGGCGG 
AGTCCATCTA CATGGCCAAG AAGCCCGTGC 

60 GTGGACTCCA AGCTGGACAT CACCTCCCAC 
GGAGCAGTAC GAGCGCACCG AGGGCCGCCA
GTGGAGTTCA

CGCGACTCTA GATCATAATG AGCCATACCA CATTTGTAGA GGTTTTACTT 5850
GCTTTAAAAA ACCTCCCACA CCTCCCCCTG AACCTGAAAC ATAAAATGAA 5900
TGCAATTGTT GTTGTTAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT 5950
AAAGCAATAG CATGACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT 6000
TCTAGTTGTG GCTCGAGAAG GGCGAATTCT GCAGATATCC ATCACACTGG 6050
CGGCCGCTCG AGGGGGGGCC CGGTACCCAA TTCGCCCTAT AGTGAGTCGT 6100
ATTACGCGCG CTCACTGGCC GTCGTTTTAC AACGTCGTGA CTGGGAAAAC 6150
CCTGGCGTTA CCCAACTTAA TCGCCTTGCA GCACATCCCC CTTTCGCCAG 6200
CTGGCGTAAT AGCGAAGAGG CCCGCACCGA TCGCCCTTCC CAACAGTTGC 6250
GCAGCCTGAA TGGCGAATGG AAATTGTAAG CGTTAATATT TTGTTAAAAT 6300
TCGCGTTAAA TTTTTGTTAA ATCAGCTCAT TTTTTAACCA ATAGGCCGAA 6350
ATCGGCAAAA TCCCTTATAA ATCAAAAGAA TAGACCGAGA TAGGGTTGAG 6400
TGTTGTTCCA GTTTGGAACA AGAGTCCACT ATTAAAGAAC GTGGACTCCA 6450
ACGTCAAAGG GCGAAAAACC GTCTATCAGG GCGATGGCCC ACTACTCCGG 6500
GATCATATGA CAAGATGTGT ATCCACCTTA ACTTAATGAT TTTTACCAAA 6550
ATCATTAGGG GATTCATCAG TGCTCAGGGT CAACGAGAAT TAACATTCCG 6600
TCAGGAAAGC TTATGATGAT GATGTGCTTA AAAACTTACT CAATGGCTGG 6650
TTATGCATAT CGCAATACAT GCGAAAAACC TAAAAGAGCT TGCCGATAAA 6700
AAAGGCCAAT TTATTGCTAT TTACCGCGGC TTTTTATTGA GCTTGAAAGA 6750
TAAATAAAAT AGATAGGTTT TATTTGAAGC TAAATCTTCT TTATCGTAAA 6800
AAATGCCCTC TTGGGTTATC AAGAGGGTCA TTATATTTCG CGGAATAACA 6850
TCATTTGGTG ACGAAATAAC TAAGCACTTG TCTCCTGTTT ACTCCCCTGA 6900
GCTTGAGGGG TTAACATGAA GGTCATCGAT AGCAGGATAA TAATACAGTA 6950
AAACGCTAAA CCAATAATCC AAATCCAGCC ATCCCAAATT GGTAGTGAAT 7000
GATTATAAAT AACAGCAAAC AGTAATGGGC CAATAACACC GGTTGCATTG 7050
GTAAGGCTCA CCAATAATCC CTGTAAAGCA CCTTGCTGAT GACTCTTTGT 7100
TTGGATAGAC ATCACTCCCT GTAATGCAGG TAAAGCGATC CCACCACCAG 7150
CCAATAAAAT TAAAACAGGG AAAACTAACC AACCTTCAGA TATAAACGCT 7200
AAAAAGGCAA ATGCACTACT ATCTGCAATA AATCCGAGCA GTACTGCCGT 7250
TTTTTCGCCC CATTTAGTGG CTATTCTTCC TGCCACAAAG GCTTGGAATA 7300
CTGAGTGTAA AAGACCAAGA CCCGCTAATG AAAAGCCAAC CATCATGCTA 7350
TTCCATCCAA AACGATTTTC GGTAAATAGC ACCCACACCG TTGCGGGAAT 7400
TTGGCCTATC AATTGCGCTG AAAAATAAAT AATCAACAAA ATGGCATCGT 7450
TTTAAATAAA GTGATGTATA CCGAATTCAG CTTTTGTTCC CTTTAGTGAG 7500
GGTTAATTGC GCGCTTGGCG TAATCATGGT CATAGCTGTT TCCTGTGTGA 7550
AATTGTTATC CGCTCACAAT TCCACACAAC ATACGAGCCG GAAGCATAAA 7600
GTGTAAAGCC TGGGGTGCCT AATGAGTGAG CTAACTCACA TTAATTGCGT 7650
TGCGCTCACT GCCCGCTTTC CAGTCGGGAA ACCTGTCGTG CCAGCTGCAT 7700
TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA TTGGGCGCTC 7750
TTCCGCTTCC TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC 7800
GAGCGGTATC AGCTCACTCA AAGGCGGTAA TACGGTTATC CACAGAATCA 7850
GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC AAAAGGCCAG 7900
GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC 7950
CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 8000
ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG 8050
CTCTCCTGTT CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC 8100
CTTCGGGAAG CGTGGCGCTT TCTCATAGCT CACGCTGTAG GTATCTCAGT 8150
TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG AACCCCCCGT 8200
TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT GAGTCCAACC 8250
CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 8300
AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC 8350
TAACTACGGC TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA 8400
AGCCAGTTAC CTTCGGAAAA AGAGTTGGTA GCTCTTGATC CGGCAAACAA 8450
ACCACCGCTG GTAGCGGTGG TTTTTTTGTT TGCAAGCAGC AGATTACGCG 8500
CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG 8550
ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 8600
TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA 8650
ATCAATCTAA AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT 8700
TAATCAGTGA GGCACCTATC TCAGCGATCT GTCTATTTCG TTCATCCATA 8750
GTTGCCTGAC TCCCCGTCGT GTAGATAACT ACGATACGGG AGGGCTTACC 8800
ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC TCACCGGCTC 8850
CAGATTTATC AGCAATAAAC CAGGCAGCCG GAAGGGCCGA GCGCAGAAGT 8900

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA 8 950 

AGCTAGAGTA AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA 9000 

TTGCTACAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT GGCTTCATTC 9050

5 AGCTCCGGTT CCCAACGATC AAGGCGAGTT ACATGATCCC CCATGTTGTG 9100 

CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC AGAAGTAAGT 9150

TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 9200 

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC 9250 

CAAGTCATTC TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG 9300 

10 CGTCAATACG GGATAATACC GCGCCACATA GCAGAACTTT AAAAGTGCTC 9350 

ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA TCTTACCGCT 9400 

GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG 9450 

CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 9500 

AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT 9550

15 ACTCTTCCTT TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA 9600 

TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA AATAGGGGTT 9650 
CCGCGCACAT TTCCCCGAAA AGTGCCAC 9678 



SEQ ID NO: 4 (pTnMod (Oval/Red) Quail)

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 

25 GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 2 00 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 3 00 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 3 50 
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 4 00 

30 GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450 
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500 
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 

35 ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 

40 GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 

45 GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACG CCGTCCCCCG TGGCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400 

50 CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 14 50 
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500 
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550 
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600 
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 

55 GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 
CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 

60 CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000
GCGACTCGCT GTATACCGTT GGCATGCTAG
GATGCCCATT GTACTTGTTG ACTGGTCTGA 
TTATGGTATT GCGAGCTTCA GTCGCACTAC 
TATGAGAAAG CGTTCCCGCT TTCAGAGGAA 
5 CCAATTTCTA GCCGACCTTG CGAGCATTCT
TCATTGTCAG TGATGCTGGC TTTAAAGTGC
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA 
AGACCTAGGA GCGGAAAACT GGAAACCTAT 
CATCTAGTCA CTCAAAGACT TTAGGCTATA 

10 CCAATCTCAT GCCAAATTCT ATTGTATAAA 
AAATCAGCGC TCGACACGGA CTCATTGTCA 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC 
GAAATTCGAA CACCCAAACA ACTTGTTAAT 
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG 

15 TACGCCATAG CCGAACGAGC AGCTCAGAGC 
ATCGCCCTGA TGCTTCAACT AACATGTTGG 
GAAACAAGGT TGGGACAAGC ACTTCCAGGC 
ACGTACTCTC AACAGTTCGC TTAGGCATGG 
TACACAATAA CAAGGGAAGA CTTACTCGTG 

20 AAATTTATTC ACACATGGTT ACGCTTTGGG 
GATCACTTCT GGCTAATAAA AGATCAGAGC 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT 
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA 
TTCCTAATAA AATGAGGAAA TTGCATCGCA 

25 CTATTCTGGG GGGTGGGGTG GGGCAGCACA 
GACAATAGCA GGCATGCTGG GGATGCGGTG 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT 
TTGACCCGGT GACCAAAGGT GCCTTTTATC 

30 CAATTACTCA GTGCCTGTTA TAAGCAGCAA 
CATCACAACA AAAACTGATT TAACAAATGG 
TATTTGAACA TTATCTTGAT TATATTATTG 
CCTATCCAAG AAGTGATGCC TATCATTGGT 
TTAGCCTTGA ATACATTACT GGTAAGGTAA 

35 ATCCAAGAGA ACCAACTTAA AGCTTTCCTG 
GACCCTGAGC ACTGATGAAT CCCCTAATGA 
TTAAGGTGGA TACACATCTT GTCATATGAT 
CACTCATTAG GCACCCCAGG CTTTACACTT 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC 

40 CATGATTACG CCAAGCGCGC AATTAACCCT 
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG 
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA 
AACAAAAGCT TCTATAACTG AAATATATTT 
TCCCTCGAAC CATGAACACT CCTCCAGCTG 

45 ATCTGCCAGG CTGGAAGATC ATGGAAGATC 
CATACCATAA ACTCATTTGG AATTGAGTAT 
TATGTTTTGC AGTTCCCTCA GAAGAAAAGC 
CCATCAAAAG ATATATTTAA ATATTCCAAC 
TCTTCACTCT GATCTCAGTT GGTTTCTTCA 

50 CCTATTTTGT CAAGAAAATA ATAGGTCAAG 
TGCCTAGCAT GGCTTAGATG CACGTTGTAC 
AACAGACTTC TGGTCTGTTA CAACAACCAT 
ATAATTGCTA ATTATGTTTT CCATCTCTAA 
TTAAGATCCC ATTATCTGGT TGTAACTGAA 

55 TATTTCTCAG TCTTTTCTCC AGCAATCCTG 
GAAAACACTT TGTTACCCAG AATTAAAAAC 
ATCCAAAATG GACCTATTGA AACTAAAATC 
ATTTCTATGG CGTCAAAGGT CAAACTTTTG 
CCCAATTCAG GCTATATATT CCCCAGGGCT 

60 CTCGTGCAGC AAGCATGGAA TTTTGCCTTG ATGTATTCAA GGAGCTCAAA 5000
GTCCACCATG CCAATGACAA CATGCTCTAC TCCCCCTTTG CCATCTGTCA 5050
ACTCTGGCCA TGGTCTCCCT GGGTGCAAAA GACAGCACCA GGGAATTCGT 5100
GCGCTCCTCC AAGAACGTCA TCAAGGAGTT CATGCGCTTC AAGGTGCGCA 515 0 
TGGAGGGCAC CGTGAACGGC CACGAGTTCG AGATCGAGGG CGAGGGCGAG 52 00 
GGCCGCCCCT ACGAGGGCCA CAACACCGTG AAGCTGAAGG TGACCAAGGG 5250
CGGCCCCCTG CCCTTCGCCT GGGACATCCT GTCCCCCCAG TTCCAGTACG 53 00 
GCTCCAAGGT GTACGTGAAG CACCCCGCCG ACATCCCCGA CTACAAGAAG 5350 
CTGTCCTTCC CCGAGGGCTT CAAGTGGGAG CGCGTGATGA ACTTCGAGGA 5400 
CGGCGGCGTG GTGACCGTGA CCCAGGACTC CTCCCTGCAG GACGGCTGCT 5450 
TCATCTACAA GGTGAAGTTC ATCGGCGTGA ACTTCCCCTC CGACGGCCCC 5500 
GTAATGCAGA AGAAGACCAT GGGCTGGGAG GCCTCCACCG AGCGCCTGTA 5550
CCCCCGCGAC GGCGTGCTGA AGGGCGAGAT CCACAAGGCC CTGAAGCTGA 5600 
AGGACGGCGG CCACTACCTG GTGGAGTTCA AGTCCATCTA CATGGCCAAG 5650 
AAGCCCGTGC AGCTGCCCGG CTACTACTAC GTGGACTCCA AGCTGGACAT 5700 
CACCTCCCAC AACGAGGACT ACACCATCGT GGAGCAGTAC GAGCGCACCG 5750 
AGGGCCGCCA CCACCTGTTC CTGTAGCGGC CGCGACTCTA GATCATAATC 5800 
AGCCATACCA CATTTGTAGA GGTTTTACTT GCTTTAAAAA ACCTCCCACA 5850 
CCTCCCCCTG AACCTGAAAC ATAAAATGAA TGCAATTGTT GTTGTTAACT 5 900 
TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG CATCACAAAT 5 95 0 
TTCACAAATA AAGCATTTTT TTCACTGCAT TCTAGTTGTG GCTCGAGAAG 6 000 
GGCGAATTCT GCAGATATCC ATCACACTGG CGGCCGCTCG AGGGGGGGCC 6 050 
CGGTACCCAA TTCGCCCTAT AGTGAGTCGT ATTACGCGCG CTCACTGGCC 6100 
GTCGTTTTAC AACGTCGTGA CTGGGAAAAC CCTGGCGTTA CCCAACTTAA 6150 
TCGCCTTGCA GCACATCCCC CTTTCGCCAG CTGGCGTAAT AGCGAAGAGG 6200 
CCCGCACCGA TCGCCCTTCC CAACAGTTGC GCAGCCTGAA TGGCGAATGG 6250 
AAATTGTAAG CGTTAATATT TTGTTAAAAT TCGCGTTAAA TTTTTGTTAA 63 00 
ATCAGCTCAT TTTTTAACCA ATAGGCCGAA ATCGGCAAAA TCCCTTATAA 6350 
ATCAAAAGAA TAGACCGAGA TAGGGTTGAG TGTTGTTCCA GTTTGGAACA 64 00 
AGAGTCCACT ATTAAAGAAC GTGGACTCCA ACGTCAAAGG GCGAAAAACC 6450 
GTCTATCAGG GCGATGGCCC ACTACTCCGG GATCATATGA CAAGATGTGT 6500 
ATCCACCTTA ACTTAATGAT TTTTACCAAA ATCATTAGGG GATTCATCAG 6550 
TGCTCAGGGT CAACGAGAAT TAACATTCCG TCAGGAAAGC TTATGATGAT 6600 
GATGTGCTTA AAAACTTACT CAATGGCTGG TTATGCATAT CGCAATACAT 6650
GCGAAAAACC TAAAAGAGCT TGCCGATAAA AAAGGCCAAT TTATTGCTAT 6700 
TTACCGCGGC TTTTTATTGA GCTTGAAAGA TAAATAAAAT AGATAGGTTT 6750 
TATTTGAAGC TAAATCTTCT TTATCGTAAA AAATGCCCTC TTGGGTTATC 6800 
AAGAGGGTCA TTATATTTCG CGGAATAACA TCATTTGGTG ACGAAATAAC 6850
TAAGCACTTG TCTCCTGTTT ACTCCCCTGA GCTTGAGGGG TTAACATGAA 6900
GGTCATCGAT AGCAGGATAA TAATACAGTA AAACGCTAAA CCAATAATCC 6950 
AAATCCAGCC ATCCCAAATT GGTAGTGAAT GATTATAAAT AACAGCAAAC 7000 
AGTAATGGGC CAATAACACC GGTTGCATTG GTAAGGCTCA CCAATAATCC 7050
CTGTAAAGCA CCTTGCTGAT GACTCTTTGT TTGGATAGAC ATCACTCCCT 7100 
GTAATGCAGG TAAAGCGATC CCACCACCAG CCAATAAAAT TAAAACAGGG 715 0 
AAAACTAACC AACCTTCAGA TATAAACGCT AAAAAGGCAA ATGCACTACT 7 200 
ATCTGCAATA AATCCGAGCA GTACTGCCGT TTTTTCGCCC CATTTAGTGG 7250 
CTATTCTTCC TGCCACAAAG GCTTGGAATA CTGAGTGTAA AAGACCAAGA 7300 
CCCGCTAATG AAAAGCCAAC CATCATGCTA TTCCATCCAA AACGATTTTC 7350 
GGTAAATAGC ACCCACACCG TTGCGGGAAT TTGGCCTATC AATTGCGCTG 7400 
AAAAATAAAT AATCAACAAA ATGGCATCGT TTTAAATAAA GTGATGTATA 7450 
CCGAATTCAG CTTTTGTTCC CTTTAGTGAG GGTTAATTGC GCGCTTGGCG 7500
TAATCATGGT CATAGCTGTT TCCTGTGTGA AATTGTTATC CGCTCACAAT 7550 
TCCACACAAC ATACGAGCCG GAAGCATAAA GTGTAAAGCC TGGGGTGCCT 7600 
AATGAGTGAG CTAACTCACA TTAATTGCGT TGCGCTCACT GCCCGCTTTC 7650 
CAGTCGGGAA ACCTGTCGTG CCAGCTGCAT TAATGAATCG GCCAACGCGC 7700 
GGGGAGAGGC GGTTTGCGTA TTGGGCGCTC TTCCGCTTCC TCGCTCACTG 7750 
ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 7800 
AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA 7850 
CATGTGAGCA AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT 7900 
TGCTGGCGTT TTTCCATAGG CTCCGCCCCC CTGACGAGCA TCACAAAAAT 7950 
CGACGCTCAA GTCAGAGGTG GCGAAACCCG ACAGGACTAT AAAGATACCA 8000 
GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT CCGACCCTGC 8050 
CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 8100
TCTCATAGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC 8150

CAAGCTGGGC TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT 82 00 

TATCCGGTAA CTATCGTCTT GAGTCCAACC CGGTAAGACA CGACTTATCG 8250 

CCACTGGCAG CAGCCACTGG TAACAGGATT AGCAGAGCGA GGTATGTAGG 8300

5 CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC TACACTAGAA 8350 

GGACAGTATT TGGTATCTGC GCTCTGCTGA AGGCAGTTAC CTTCGGAAAA 8400

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG 8450 

TTTTTTTGTT TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG 8500 

AAGATCCTTT GATCTTTTCT ACGGGGTCTG ACGCTCAGTG GAACGAAAAC 8550 

10 TCACGTTAAG GGATTTTGGT CATGAGATTA TCAAAAAGGA TCTTCACCTA 8600 

GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA AGTATATATG 8650 

AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 8700 

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT 8750 

GTAGATAACT ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA 8800 

15 TGATACCGCG AGACCCACGC TCACCGGCTC CAGATTTATC AGCAATAAAC 88 50 

CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT GGTCCTGCAA CTTTATCCGC 89 00 

CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA AGTAGTTCGC 89 50 

CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG 9000 

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC 9050 

20 AAGGCGAGTT ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT 9100 

TCGGTCCTCC GATCGTTGTC AGAAGTAAGT TGGCCGCAGT GTTATCACTC 9150 

ATGGTTATGG CAGCACTGCA TAATTCTCTT ACTGTCATGC CATCCGTAAG 92 00 

ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC TGAGAATAGT 9250 

GTATGCGGCG ACCGAGTTGC TCTTGCCCGG CGTCAATACG GGATAATACC 93 00 

25 GCGCCACATA GCAGAACTTT AAAAGTGCTC ATCATTGGAA AACGTTCTTC 93 50 

GGGGCGAAAA CTCTCAAGGA TCTTACCGCT GTTGAGATCC AGTTCGATGT 94 00 

AACCCACTCG TGCACCCAAC TGATCTTCAG CATCTTTTAC TTTCACCAGC 9450 

GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA AATGCCGCAA AAAAGGGAAT 9500 

AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT TTTCAATATT 9550 

30 ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA 9600 

TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA 9650 

AGTGCCAC 9658 



SEQ ID NO: 5 (spacer)
(GPGG) X 

SEQ ID NO: 6 (spacer) 
GPGGGPGGGPGG 


SEQ ID NO: 7 (spacer) 
GGGGSGGGGSGGGGS 


SEQ ID NO: 8 (spacer) 
GGGGSGGGGSGGGGSGGGGS 
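The spacers of SEQ ID NOs: 6 to 8 are tandem repeats of a short unit (GPGG, or GGGGS). A small sketch, assuming one-letter amino acid codes and using a hypothetical helper `count_repeats`, shows how the repeat number of each listed spacer can be confirmed.

```python
def count_repeats(seq, unit):
    """Return how many times `unit` tiles `seq` exactly, or None if it does not."""
    n, remainder = divmod(len(seq), len(unit))
    if remainder == 0 and unit * n == seq:
        return n
    return None


SEQ_ID_6 = "GPGGGPGGGPGG"
SEQ_ID_7 = "GGGGSGGGGSGGGGS"
SEQ_ID_8 = "GGGGSGGGGSGGGGSGGGGS"

print(count_repeats(SEQ_ID_6, "GPGG"))   # 3
print(count_repeats(SEQ_ID_7, "GGGGS"))  # 3
print(count_repeats(SEQ_ID_8, "GGGGS"))  # 4
```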



SEQ ID NO: 9 (enterokinase cleavage site)
DDDDK 



SEQ ID NO: 10 (altered transposase Hef forward primer) 
ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC



SEQ ID NO: 11 (altered transposase Her reverse primer) 
GATTGATCATTATCATAATTTCCCCAAAGCGTAACC 

SEQ ID NO: 12 (Xho I restriction site)
CTCGAG 



SEQ ID NO: 13 (modified Kozak sequence)
ACCATG 



SEQ ID NO: 14 (Bcl I restriction site) 
TGATCA
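The short elements of SEQ ID NOs: 12 to 14 can be located within the altered transposase primers of SEQ ID NOs: 10 and 11 by a simple substring search. The sketch below is illustrative only. The helper `site_positions` is hypothetical, and the forward primer is written with G at the position where the scanned listing shows the digit 6, which is an assumption based on the matching transposase sequence elsewhere in the listing.

```python
XHO_I = "CTCGAG"   # SEQ ID NO: 12
KOZAK = "ACCATG"   # SEQ ID NO: 13
BCL_I = "TGATCA"   # SEQ ID NO: 14

# SEQ ID NO: 10 (OCR digit "6" read as G, an assumption) and SEQ ID NO: 11.
FORWARD = "ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC"
REVERSE = "GATTGATCATTATCATAATTTCCCCAAAGCGTAACC"


def site_positions(primer, site):
    """Return every 0-based index at which `site` starts within `primer`."""
    return [i for i in range(len(primer) - len(site) + 1)
            if primer.startswith(site, i)]


print(site_positions(FORWARD, XHO_I))  # [2]
print(site_positions(FORWARD, KOZAK))  # [8]
print(site_positions(REVERSE, BCL_I))  # [3]
```

In the forward primer the Xho I site is immediately followed by the modified Kozak sequence, whose ATG is the first codon of the transposase reading frame that continues through the rest of the primer.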



SEQ ID NO: 15 (CMVf-NgoM IV primer)
TTGCCGGCATCAGATTGGCTAT 


SEQ ID NO: 16 (Syn-polyAr-BstE II primer)
AGAGGTCACCGGGTCAATTCTTCAGCACCTGGTA 


SEQ ID NO: 17 (vitellogenin promoter) 

TGAATGTGTT CTTGTGTTAT CAATATAAAT CACAGTTAGT GATGAAGTTG 
TGCATCAGTT CAGCTACTTG GCTGCATTTT GTATTTGGTT CTGTAGGAAA 

25 TCTAGGCTGA CCTGCACTTC TATCCCTCTT GCCTTACTGC TGAGAATCTC 
AATTGTTCAC ATTTTGCTCC CATTTACTTT GGAAGATAAA ATATTTACAG 
AAACCTTTGT TCATTTAAAA ATATTCCTGG TCAGCGTGAC CGGAGCTGAA 
GATCCCGTGA TTTCAATAAA TACATATGTT CCATATATTG TTTCTCAGTA 
TCATGTGCGT TGGTGCACAT ATGAATACAT GAATAGCAAA GGTTTATCTG 

30 TGGCCTGCAG GAATGGCCAT AAACCAAAGC TGAGGGAAGA GGGAGAGTAT 
GATTATACTG ATTGCTGATT GGGTTATTAT CAGCTAGATA ACAACTTGGG 
GGTCAACATA ACCTGGGCAA AACCAGTCTC ATCTGTGGCA GGACCATGTA 
GCCGTGACCC AATCTAGGAA AGCAAGTAGC ACATCAATTT TAAATTTATT 
TAGTAGAAGT GTTTTACTGT GATACATTGA AACTTCTGGT CAATCAGAAA 

35 ATCAGAGATG CCAAGGTATT ATTTGATTTT CTTTATTCGC CGTGAAGAGA 
GCAAAAAGAG GAGTGTTTAC ATAAACTGAT AAAAAACTTG AGGAATTCAG 
CCACGTGTTC CTGAACATTC TTCCATAAAA GTCTCACCAT GCCTGGCAGA
CCTTCGCT 


SEQ ID NO: 18 (vitellogenin targeting sequence) 
ATGAGGGGGATCATACTGGCATTAGTGCTCACCCTTGTAGGCAGCCAGAAGTTTGACATTGGT 


SEQ ID NO: 19 (p146 protein)
KYKKALKKLAKLL 



SEQ ID NO: 20 (p146 coding sequence)

AAATACAAAAAAGCACTGAAAAAACTGGCAAAACTGCTG 
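The coding sequence of SEQ ID NO: 20 can be checked against the protein of SEQ ID NO: 19 by translating it with the standard genetic code. The following is a minimal sketch. The function name `translate` and the compact codon table construction are illustrative choices and are not taken from the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, reading whole codons from position 0."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SEQ_ID_20 = "AAATACAAAAAAGCACTGAAAAAACTGGCAAAACTGCTG"
SEQ_ID_19 = "KYKKALKKLAKLL"

print(translate(SEQ_ID_20))               # KYKKALKKLAKLL
print(translate(SEQ_ID_20) == SEQ_ID_19)  # True
```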



SEQ ID NO: 21 (pro-insulin sequence)
TTTGTGAACCAACACCTGTGCGGCTCACACCTGGTGGAAGCTCTCTACCTAGTGTGCGGGGAACGAGGC
TTCTTCTACACACCCAAGACCCGCCGGGAGGCAGAGGACCTGCAGGTGGGGCAGGTGGAGCTGGGCGGG
GGCCCTGGTGCAGGCAGCCTGCAGCCCTTGGCCCTGGAGGGGTCCCTGCAGAAGCGTGGCATTGTGGAA
CAATGCTGTACCAGCATCTGCTCCCTCTACCAGCTGGAGAACTACTGCAACTAG


SEQ ID NO: 22 (TAG sequence)
Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys
Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys



SEQ ID NO: 23 (gp41 epitope)

Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys Gly Trp Ile Gly Leu Leu



SEQ ID NO: 24 (polynucleotide sequence encoding gp41 epitope) 

Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly
Ser Cys Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys

SEQ ID NO: 25 (repeat domain in TAG spacer sequence) 
Pro Ala Asp Asp Ala



SEQ ID NO: 26 (TAG Spacer sequence) 

Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
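The TAG sequence of SEQ ID NO: 22 reads as the TAG spacer (SEQ ID NO: 26) followed by the gp41 epitope (SEQ ID NO: 23) and the enterokinase cleavage site (SEQ ID NO: 9). The sketch below checks that composition. It assumes that the residues printed as "He" or "lie" in the scanned listing are Ile, and the helper `to_one_letter` is a hypothetical name covering only the residues that occur in these sequences.

```python
# Three-letter to one-letter codes for the residues used in SEQ ID NOs: 22-26.
THREE_TO_ONE = {
    "Ala": "A", "Asp": "D", "Cys": "C", "Gly": "G", "Ile": "I", "Leu": "L",
    "Lys": "K", "Pro": "P", "Ser": "S", "Thr": "T", "Trp": "W",
}


def to_one_letter(text):
    """Convert a whitespace-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[residue] for residue in text.split())


REPEAT = to_one_letter("Pro Ala Asp Asp Ala")          # SEQ ID NO: 25
SPACER = REPEAT * 5 + to_one_letter("Pro Ala Asp Asp")  # SEQ ID NO: 26 as listed
EPITOPE = to_one_letter(                                # SEQ ID NO: 23
    "Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys Gly Trp Ile Gly Leu Leu"
)
ENTEROKINASE_SITE = "DDDDK"                             # SEQ ID NO: 9

TAG = to_one_letter(                                    # SEQ ID NO: 22
    "Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp "
    "Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys "
    "Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys"
)

print(TAG == SPACER + EPITOPE + ENTEROKINASE_SITE)  # True
print(len(SPACER), len(EPITOPE), len(TAG))          # 29 16 50
```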

SEQ ID NO: 27 (Vit pro/Vit targ/TAG/pro-insulin/synthetic polyA)
TGAATGTGTT CTTGTGTTAT CAATATAAAT CACAGTTAGT GATGAAGTTG GCTGCAAGCC 
TGCATCAGTT CAGCTACTTG GCTGCATTTT GTATTTGGTT CTGTAGGAAA TGCAAAAGGT

25 TCTAGGCTGA CCTGCACTTC TATCCCTCTT GCCTTACTGC TGAGAATCTC TGCAGGTTTT 
AATTGTTCAC ATTTTGCTCC CATTTACTTT GGAAGATAAA ATATTTACAG AATGCTTATG 
AAACCTTTGT TCATTTAAAA ATATTCCTGG TCAGCGTGAC CGGAGCTGAA AGAACACATT 
GATCCCGTGA TTTCAATAAA TACATATGTT CCATATATTG TTTCTCAGTA GCCTCTTAAA 
TCATGTGCGT TGGTGCACAT ATGAATACAT GAATAGCAAA GGTTTATCTG GATTACGCTC 

TGGCCTGCAG GAATGGCCAT AAACCAAAGC TGAGGGAAGA GGGAGAGTAT AGTCAATGTA
GATTATACTG ATTGCTGATT GGGTTATTAT CAGCTAGATA ACAACTTGGG TCAGGTGCCA 
GGTCAACATA ACCTGGGCAA AACCAGTCTC ATCTGTGGCA GGACCIATGTA CCAGCAGCCA 
GCCGTGACCC AATCTAGGAA AGCAAGTAGC ACATCAATTT TAAATTTATT GTAAATGCCG 
TAGTAGAAGT GTTTTACTGT GATACATTGA AACTTCTGGT CAATCAGAAA AAGGTTTTTT 

ATCAGAGATG CCAAGGTATT ATTTGATTTT CTTTATTCGC CGTGAAGAGA ATTTATGATT
GCAAAAAGAG GAGTGTTTAC ATAAACTGAT AAAAAACTTG AGGAATTCAG CAGAAAACAG 
CCACGTGTTC CTGAACATTC TTCCATAAAA GTCTCACCAT GCCTGGCAGA GCCCTATTCA 
CCTTCGCTAT GAGGGGGATC ATACTGGCAT TAGTGCTCAC CCTTGTAGGC AGCCAGAAGT 
TTGACATTGG TAGACTGAGA ATGGCAAGAA GAATGAGAAGA TGGTTTGTG AACCAACACC 

TGTGCGGCTCA CACCTGGTGG AAGCTCTCTA CCTAGTGTGCG GGGAACGAGG CTTCTTCTAC
ACACCCAAGA CCCGCCGGGA GGCAGAGGAC CTGCAGGTGGG GCAGGTGGAG CTGGGCGGGG 
GCCCTGGTGC AGGCAGCCTG CAGCCCTTGG CCCTGGAGGGG TCCCTGCAGA AGCGTGGCAT 
TGTGGAACAA TGCTGTACCA GCATCTGCTC CCTCTACCAGC TGGAGAACTA CTGCAACTAG 
GGCGCCTGGATCCAGATCACTTCTGGCTAATAAAAGATCAGAGCTCTAGAGATCTGTGTGTTGGTTTTT 

TGTGGATCTGCTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACC
CTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGG
TGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGCACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGG
CATGCTGGGGATGCGGTGGGCTCTATGGGTACCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
TCTCGGTACCTCTCTC

SEQ ID NO: 28 (synthetic polyA sequence) 

ggcgcctggatccagatcacttctggctaataaaagatcagagctctagagatctgtgtgttggttttt
tgtggatctgctgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgacc 
ctggaaggtgccactcccactgtcctttcctaataaaatgaggaaattgcatcgcattgtctgagtagg 
tgtcattctattctggggggtggggtggggcagcacagcaagggggaggattgggaagacaatagcagg
catgctggggatgcggtgggctctatgggtacctctctctctctctctctctctctctctctctctctc 
tctcggtacctctctc 
SEQ ID NO:29 (pTnMod (Oval/ENT tag/P146/PA) - Chicken)

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900
GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150
GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250
AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650
GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750
CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000
GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150
TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250
TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500
AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600
GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700
TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750
ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800
GAAACAAGGT TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2850
ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2900
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150
TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200
CTATTCTGGG GGGTGGGGTG GGGCAGCACA GCAAGGGGGA GGATTGGGAA 3250
GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3300
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400
TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500
CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750
GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800
TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3850
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGG 4050
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100
AACAATAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG 4150
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4200
ATCTGCCAGG CCATTAAGTT ATTCATGGAA GATCTTTGAG GAACACTGCA 4250
AGTTCATATC ATAAACACAT TTGAAATTGA GTATTGTTTT GCATTGTATG 4300
GAGCTATGTT TTGCTGTATC CTCAGAAAAA AAGTTTGTTA TAAAGCATTC 4350
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAGG AAAGAAAGTG 4400
CGTCTGCTCT TCACTCTAGT CTCAGTTGGC TCCTTCACAT GPATGPTTPT 4450
TTATTTCTCC TATTTTGTCA AGAAAATAAT AGGTCACGTC TTflTTPTPAP 4500
TTATGTCCTG CCTAGCATGG CTCAGATGCA CGTTGTAGAT ACAAGAAGGA 4550
TCAAATGAAA CAGACTTCTG GTCTGTTACT ACAACCATAG TAATAAGCAC 4600
ACTAACTAAT AATTGCTAAT TATGTTTTCC ATCTCTAAGG [illegible] 4650
TTTCTGTTTT CTTAAAGATC CCATTATCTG GTTGTAACTG AAGCTCAATG 4700
GAACATGAGC AATATTTCCC AGTCTTCTCT CCCATCCAAC AGTCCTGATG 4750
GATTAGCAGA ACAGGCAGAA AACACATTGT TACCCAGAAT TAAAAACTAA 4800
TATTTGCTCT CCATTCAATC CAAAATGGAC CTATTGAAAC TAAAATCTAA 4850
CCCAATCCCA TTAAATGATT TCTATGGCGT CAAAGGTCAA ACTTCTGAAG 4900
GGAACCTGTG GGTGGGTCAC AATTCAGGCT ATATATTCCC CAGGGCTCAG 4950
CGGATCCATG GGCTCCATCG GCGCAGCAAG CATGGAATTT TGTTTTGATG 5000
TATTCAAGGA GCTCAAAGTC CACCATGCCA ATGAGAACAT CTTCTAPTGP 5050
CCCATTGCCA TCATGTCAGC TCTAGCCATG GTATACCTGG GTGPAAAAGA 5100
CAGCACCAGG ACACAGATAA ATAAGGTTGT TPGPTTT'GAT AAAPTTPPAG 5150
GATTCGGAGA CAGTATTGAA GCTCAGTGTG [illegible] AAAPGTTPAP 5200
TCTTCACTTA GAGACATCCT CAACCAAATC [illegible] A'PfZATGTT'PA 5250
TTCGTTCAGC CTTGCCAGTA GACTTTATGC TGAAGAGAGA TAPPPAATPP 5300
TGCCAGAATA CTTGCAGTGT GTGAAGGAAC TGTATAGAGG AGGCTTGGAA 5350
CCTATCAACT TTCAAACAGC TGCAGATCAA GCCAGAGAGC TPATPAATTP 5400
CTGGGTAGAA AGTCAGACAA ATGGAATTAT CAGAAATGTC CTTCAGCCAA 5450
GCTCCGTGGA TTCTCAAACT GCAATGGTTC TGGTTAATGC [illegible] 5500
AAAGGACTGT GGGAGAAAAC ATTTAAGGAT GAAGACACAC AAGCAATGPP 5550
TTTCAGAGTG ACTGAGCAAG AAAGCAAACC TGTGCAGATG ATGTACCAGA 5600
TTGGTTTATT TAGAGTGGCA TCAATGGCTT CTGAGAAAAT GAAGATCCTG 5650
GAGCTTCCAT TTGCCAGTGG GACAATGAGC ATGTTGGTGC TGTTGCCTGA 5700
TGAAGTCTCA GGCCTTGAGC AGCTTGAGAG TATAATCAAC TTTGAAAAAC 5750
TGACTGAATG GACCAGTTCT AATGTTATGG AAGAGAGGAA GATCAAAGTG 5800
TACTTACCTC GCATGAAGAT GGAGGAAAAA TACAACCTCA CATCTGTCTT 5850
AATGGCTATG GGCATTACTG ACGTGTTTAG CTCTTCAGCC AATCTGTCTG 5900
GCATCTCCTC AGCAGAGAGC CTGAAGATAT CTCAAGCTGT CCATGCAGCA 5950
CATGCAGAAA TCAATGAAGC AGGCAGAGAG [illegible] CAGCAGAGGC 6000
TGGAGTGGAT GCTGCAAGCG TCTCTGAAGA ATTTAGGGCT GACCATCCAT 6050
TCCTCTTCTG TATCAAGCAC ATCGCAACCA ACGCGGTTCT CTTCTTTGGC 6100
AGATGTGTTT CCCCTCCGCG GCCAGCAGAT GACGCACCAG CAGATGACGC 6150
ACCAGCAGAT GACGCACCAG CAGATGACGC ACCAGCAGAT GACGCACCAG 6200
CAGATGACGC AACAACATGT ATCCTGAAAG GCTCTTGTGG CTGGATCGGC 6250
CTGCTGGATG ACGATGACAA AAAATACAAA AAAGCACTGA AAAAACTGGC 6300
AAAACTGCTG TAATGAGGGC GCCTGGATCC AGATCACTTC TGGCTAATAA 6350
AAGATCAGAG CTCTAGAGAT CTGTGTGTTG GTTTTTTGTG GATCTGCTGT 6400
GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT 6450
TGACCCTGGA AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA 6500
ATTGCATCGC ATTGTCTGAG TAGGTGTCAT TCTATTCTGG GGGGTGGGGT 6550
GGGGCAGCAC AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG 6600
GGGATGCGGT GGGCTCTATG GGTACCTCTC TCTCTCTCTC TCTCTCTCTC 6650
TCTCTCTCTC TCTCTCGGTA CCTCTCTCGA GGGGGGGCCC GGTACCCAAT 6700
TCGCCCTATA GTGAGTCGTA TTACGCGCGC TCACTGGCCG TCGTTTTACA 6750
ACGTCGTGAC TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG 6800
CACATCCCCC TTTCGCCAGC TGGCGTAATA GCGAAGAGGC CCGCACCGAT 6850
CGCCCTTCCC AACAGTTGCG CAGCCTGAAT GGCGAATGGA AATTGTAAGC 6900
GTTAATATTT TGTTAAAATT CGCGTTAAAT TTTTGTTAAA TCAGCTCATT 6950
TTTTAACCAA TAGGCCGAAA TCGGCAAAAT CCCTTATAAA TCAAAAGAAT 7000
AGACCGAGAT AGGGTTGAGT GTTGTTCCAG TTTGGAACAA GAGTCCACTA 7050
TTAAAGAACG TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG 7100
CGATGGCCCA CTACTCCGGG ATCATATGAC AAGATGTGTA TCCACCTTAA 7150
CTTAATGATT TTTACCAAAA TCATTAGGGG ATTCATCAGT GCTCAGGGTC 7200
AACGAGAATT AACATTCCGT CAGGAAAGCT TATGATGATG ATGTGCTTAA 7250
AAACTTACTC AATGGCTGGT TATGCATATC GCAATACATG CGAAAAACCT 7300
AAAAGAGCTT GCCGATAAAA AAGGCCAATT TATTGCTATT TACCGCGGCT 7350
TTTTATTGAG CTTGAAAGAT AAATAAAATA GATAGGTTTT ATTTGAAGCT 7400
AAATCTTCTT TATCGTAAAA AATGCCCTCT TGGGTTATCA AGAGGGTCAT 7450
TATATTTCGC GGAATAACAT CATTTGGTGA CGAAATAACT AAGCACTTGT 7500
CTCCTGTTTA CTCCCCTGAG CTTGAGGGGT TAACATGAAG GTCATCGATA 7550
GCAGGATAAT AATACAGTAA AACGCTAAAC CAATAATCCA AATCCAGCCA 7600
TCCCAAATTG GTAGTGAATG ATTATAAATA ACAGCAAACA GTAATGGGCC 7650
AATAACACCG GTTGCATTGG TAAGGCTCAC CAATAATCCC TGTAAAGCAC 7700
CTTGCTGATG ACTCTTTGTT TGGATAGACA TCACTCCCTG TAATGCAGGT 7750
AAAGCGATCC CACCACCAGC CAATAAAATT AAAACAGGGA AAACTAACCA 7800
ACCTTCAGAT ATAAACGCTA AAAAGGCAAA TGCACTACTA TCTGCAATAA 7850
ATCCGAGCAG TACTGCCGTT TTTTCGCCCC ATTTAGTGGC TATTCTTCCT 7900
GCCACAAAGG CTTGGAATAC TGAGTGTAAA AGACCAAGAC CCGCTAATGA 7950
AAAGCCAACC ATCATGCTAT TCCATCCAAA ACGATTTTCG GTAAATAGCA 8000
CCCACACCGT TGCGGGAATT TGGCCTATCA ATTGCGCTGA AAAATAAATA 8050
ATCAACAAAA TGGCATCGTT TTAAATAAAG TGATGTATAC CGAATTCAGC 8100
TTTTGTTCCC TTTAGTGAGG GTTAATTGCG CGCTTGGCGT AATCATGGTC 8150
ATAGCTGTTT CCTGTGTGAA ATTGTTATCC GCTCACAATT CCACACAACA 8200
TACGAGCCGG AAGCATAAAG TGTAAAGCCT GGGGTGCCTA ATGAGTGAGC 8250
TAACTCACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 8300
CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG 8350
GTTTGCGTAT TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC 8400
TCGGTCGTTC GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT 8450
ACGGTTATCC ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA 8500
AAGGCCAGCA AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT 8550
TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG 8600
TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC 8650
CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA 8700
TACCTGTCCG CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC 8750
ACGCTGTAGG TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT 8800
GTGTGCACGA ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC 8850
TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC 8900
AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG 8950
AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT 9000
GGTATCTGCG CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG 9050
CTCTTGATCC GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT 9100
GCAAGCAGCA GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG 9150
ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG 9200
GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 9250
ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG 9300
TCTGACAGTT ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG 9350
TCTATTTCGT TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA 9400
CGATACGGGA GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA 9450
GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG 9500
AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 9550
CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT 9600
TTGCGCAACG TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC 9650
GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA 9700
CATGATCCCC CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG 9750
ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC 9800
AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG 9850
TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA 9900
CCGAGTTGCT CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG 9950
CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC 10000
TCTCAAGGAT CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT 10050
GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG 10100
AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 10150
GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT 10200
TATCAGGGTT ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA 10250
AAATAAACAA ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCAC 10297

SEQ ID NO:30 (pTnMod(Oval/ENT tag/P146/PA) - QUAIL) 



CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 

35 CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450 

40 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500 
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700 

45 TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 
GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950 

50 CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 

55 TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400 
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450 

60 CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500 
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550

CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600

GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 

GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700 

CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 

CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800

TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850

CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 

CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 

AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 

10 GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050 

GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 

TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150

TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200 

CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250 

15 TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 

AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 23 50 

AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 24 00 

CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 24 50 

CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 25 00 

20 AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 

ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 26 00 

GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650 

GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700 

TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750 

25 ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 28 00 

GAAACAAGGT TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2850 

ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2 900 

TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2 950 

AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000 

30 GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050 

TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 

CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150 

TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200 

CTATTCTGGG GGGTGGGGTG GGGCAGCACA GCAAGGGGGA GGATTGGGAA 3250

35 GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3300 

CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 

CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400 

TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 

CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 

40 CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550 

TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 

CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 

TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700 

ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 

45 GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800 

TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3 850 

CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3 900 

TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 

CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000 

50 GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGG 4 050 

AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100 

AACAAAAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG 4150 

TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4 200 

ATCTGCCAGG CTGGAAGATC ATGGAAGATC TCTGAGGAAC ATTGCAAGTT 4250 

55 CATACCATAA ACTCATTTGG AATTGAGTAT TATTTTGCTT TGAATGGAGC 4300 

TATGTTTTGC AGTTCCCTCA GAAGAAAAGC TTGTTATAAA GCGTCTACAC 4350 

CCATCAAAAG ATATATTTAA ATATTCCAAC TACAGAAAGA TTTTGTCTGC 44 00 

TCTTCACTCT GATCTCAGTT GGTTTCTTCA CGTACATGCT TCTTTATTTG 4450 

CCTATTTTGT CAAGAAAATA ATAGGTCAAG TCCTGTTCTC ACTTATCTCC 4500 

60 TGCCTAGCAT GGCTTAGATG CACGTTGTAC ATTCAAGAAG GATCAAATGA 4 550 

AACAGACTTC TGGTCTGTTA CAACAACCAT AGTAATAAAC AGACTAACTA 4600 
ATAATTGCTA ATTATGTTTT CCATCTCTAA GGTTCCCACA TTTTTCTGTT 4650
TTAAGATCCC ATTATCTGGT TGTAACTGAA GCTCAATGGA ACATGAACAG 4700
TATTTCTCAG TCTTTTCTCC AGCAATCCTG ACGGATTAGA AGAACTGGCA 4750
GAAAACACTT TGTTACCCAG AATTAAAAAC TAATATTTGC TCTCCCTTCA 4800
ATCCAAAATG GACCTATTGA AACTAAAATC TGACCCAATC CCATTAAATT 4850
ATTTCTATGG CGTCAAAGGT CAAACTTTTG AAGGGAACCT GTGGGTGGGT 4900
CCCAATTCAG GCTATATATT CCCCAGGGCT CAGCCAGTGG ATCCATGGGC 4950
TCCATCGGTG CAGCAAGCAT GGAATTTTGT TTTGATGTAT TCAAGGAGCT 5000 
CAAAGTCCAC CATGCCAATG ACAACATGCT CTACTCCCCC TTTGCCATCT 5050 

10 TGTCAACTCT GGCCATGGTC TTCCTAGGTG CAAAAGACAG CACCAGGACC 5100 
CAGATAAATA AGGTTGTTCA CTTTGATAAA CTTCCAGGAT TCGGAGACAG 5150 
TATTGAAGCT CAGTGTGGCA CATCTGTAAA TGTTCACTCT TCACTTAGAG 5200 
ACATACTCAA CCAAATCACC AAACAAAATG ATGCTTATTC GTTCAGCCTT 5250 
GCCAGTAGAC TTTATGCTCA AGAGACATAC ACAGTCGTGC CGGAATACTT 53 00 

15 GCAATGTGTG AAGGAACTGT ATAGAGGAGG CTTAGAATCC GTCAACTTTC 5350 
AAACAGCTGC AGATCAAGCC AGAGGCCTCA TCAATGCCTG GGTAGAAAGT 54 00 
CAGACAAACG GAATTATCAG AAACATCCTT CAGCCAAGCT CCGTGGATTC 5450 
TCAAACTGCA ATGGTCCTGG TTAATGCCAT TGCCTTCAAG GGACTGTGGG 5 500 
AGAAAGCATT TAAGGCTGAA GACACGCAAA CAATACCTTT CAGAGTGACT 5 550 

20 GAGCAAGAAA GCAAACCTGT GCAGATGATG TACCAGATTG GTTCATTTAA 5600 
AGTGGCATCA ATGGCTTCTG AGAAAATGAA GATCCTGGAG CTTCCATTTG 5650 
CCAGTGGAAC AATGAGCATG TTGGTGCTGT TGCCTGATGA TGTCTCAGGC 5700 
CTTGAGCAGC TTGAGAGTAT AATCAGCTTT GAAAAACTGA CTGAATGGAC 5750 
CAGTTCTAGT ATTATGGAAG AGAGGAAGGT CAAAGTGTAC TTACCTCGCA 5800 

25 TGAAGATGGA GGAGAAATAC AACCTCACAT CTCTCTTAAT GGCTATGGGA 5850 
ATTACTGACC TGTTCAGCTC TTCAGCCAAT CTGTCTGGCA TCTCCTCAGT 5900 
AGGGAGCCTG AAGATATCTC AAGCTGTCCA TGCAGCACAT GCAGAAATCA 5950 
ATGAAGCGGG CAGAGATGTG GTAGGCTCAG CAGAGGCTGG AGTGGATGCT 6000 
ACTGAAGAAT TTAGGGCTGA CCATCCATTC CTCTTCTGTG TCAAGCACAT 6050 

30 CGAAACCAAC GCCATTCTCC TCTTTGGCAG ATGTGTTTCT CCGCGGCCAG 6100 
CAGATGACGC ACCAGCAGAT GACGCACCAG CAGATGACGC ACCAGCAGAT 6150 
GACGCACCAG CAGATGACGC ACCAGCAGAT GACGCAACAA CATGTATCCT 6200 
GAAAGGCTCT TGTGGCTGGA TCGGCCTGCT GGATGACGAT GACAAAAAAT 6250 
ACAAAAAAGC ACTGAAAAAA CTGGCAAAAC TGCTGTAATG AGGGCGCCTG 6300 

35 GATCCAGATC ACTTCTGGCT AATAAAAGAT CAGAGCTCTA GAGATCTGTG 6350 
TGTTGGTTTT TTGTGGATCT GCTGTGCCTT CTAGTTGCCA GCCATCTGTT 6400 
GTTTGCCCCT CCCCCGTGCC TTCCTTGACC CTGGAAGGTG CCACTCCCAC 6450 
TGTCCTTTCC TAATAAAATG AGGAAATTGC ATCGCATTGT CTGAGTAGGT 6500 
GTCATTCTAT TCTGGGGGGT GGGGTGGGGC AGCACAGCAA GGGGGAGGAT 6550 

40 TGGGAAGACA ATAGCAGGCA TGCTGGGGAT GCGGTGGGCT CTATGGGTAC 6600 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCTCT 6650 
CTCGAGGGGG GGCCCGGTAC CCAATTCGCC CTATAGTGAG TCGTATTACG 6700 
CGCGCTCACT GGCCGTCGTT TTACAACGTC GTGACTGGGA AAACCCTGGC 6750 
GTTACCCAAC TTAATCGCCT TGCAGCACAT CCCCCTTTCG CCAGCTGGCG 6800

45 TAATAGCGAA GAGGCCCGCA CCGATCGCCC TTCCCAACAG TTGCGCAGCC 6850 
TGAATGGCGA ATGGAAATTG TAAGCGTTAA TATTTTGTTA AAATTCGCGT 6900 
TAAATTTTTG TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC 6950
AAAATCCCTT ATAAATCAAA AGAATAGACC GAGATAGGGT TGAGTGTTGT 7000 
TCCAGTTTGG AACAAGAGTC CACTATTAAA GAACGTGGAC TCCAACGTCA 7050 

50 AAGGGCGAAA AACCGTCTAT CAGGGCGATG GCCCACTACT CCGGGATCAT 7100 
ATGACAAGAT GTGTATCCAC CTTAACTTAA TGATTTTTAC CAAAATCATT 7150 
AGGGGATTCA TCAGTGCTCA GGGTCAACGA GAATTAACAT TCCGTCAGGA 7200
AAGCTTATGA TGATGATGTG CTTAAAAACT TACTCAATGG CTGGTTATGC 7250 
ATATCGCAAT ACATGCGAAA AACCTAAAAG AGCTTGCCGA TAAAAAAGGC 7300 

55 CAATTTATTG CTATTTACCG CGGCTTTTTA TTGAGCTTGA AAGATAAATA 7350 
AAATAGATAG GTTTTATTTG AAGCTAAATC TTCTTTATCG TAAAAAATGC 7400 
CCTCTTGGGT TATCAAGAGG GTCATTATAT TTCGCGGAAT AACATCATTT 7450 
GGTGACGAAA TAACTAAGCA CTTGTCTCCT GTTTACTCCC CTGAGCTTGA 7500 
GGGGTTAACA TGAAGGTCAT CGATAGCAGG ATAATAATAC AGTAAAACGC 7550 

60 TAAACCAATA ATCCAAATCC AGCCATCCCA AATTGGTAGT GAATGATTAT 7600 
AAATAACAGC AAACAGTAAT GGGCCAATAA CACCGGTTGC ATTGGTAAGG 7650 
GTCACCAATA ATCCCTGTAA AGCACCTTGC TGATGACTCT TTGTTTGGAT 7700
AGACATCACT CCCTGTAATG CAGGTAAAGC GATCCCACCA CCAGCCAATA 7 750 
AAATTAAAAC AGGGAAAACT AACCAACCTT CAGATATAAA CGCTAAAAAG 7800 
GCAAATGCAC TACTATCTGC AATAAATCCG AGCAGTACTG CCGTTTTTTC 7850 
5 GCCCCATTTA GTGGCTATTC TTCCTGCCAC AAAGGCTTGG AATACTGAGT 7900 
GTAAAAGACC AAGACCCGCT AATGAAAAGC CAACCATCAT GCTATTCCAT 7950 
CCAAAACGAT TTTCGGTAAA TAGCACCCAC ACCGTTGCGG GAATTTGGCC 8000 
TATCAATTGC GCTGAAAAAT AAATAATCAA CAAAATGGCA TCGTTTTAAA 8050 
TAAAGTGATG TATACCGAAT TCAGCTTTTG TTCCCTTTAG TGAGGGTTAA 8100 

10 TTGCGCGCTT GGCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT 8150 
TATCCGCTCA CAATTCCACA CAACATACGA GCCGGAAGCA TAAAGTGTAA 8200 
AGCCTGGGGT GCCTAATGAG TGAGCTAACT CACATTAATT GCGTTGCGCT 8250 
CACTGCCCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA 8300 
ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC 8350 

15 TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG 84 00 
TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT 84 50 
AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG 8500 
TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG 8550 
AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA 8600 

20 CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC 8650 
TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG 87 00 
GAAGCGTGGC GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG 8750 
TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC 8800 
CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA 8850 

25 GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA 8900 
GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA 8 950 
CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG 9000 
TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC 9050 
GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA 9100 

30 AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC 9150 
AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA 9200 
AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT 9250 
CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA 9300 
GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 9350 

35 TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG 9400 
CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT 9450 
TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT 9500
GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG 9550 
AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA 9600 

40 CAGGCATCGT GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC 9650 
GGTTCCCAAC GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA 9700 
AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG 9750 
CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC 9800 
ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC 9850 

45 ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA 9900 
TACGGGATAA TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT 9950 
GGAAAACGTT CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG 10000 
ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 10050 
TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC 10100 

50 GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT 10150 
CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG 10200 
GATACATATT TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC 10250
ACATTTCCCC GAAAAGTGCC AC 10272 


SEQ ID NO: 31 (pTnMod (Oval/ENT tag/Proins/PA) - Chicken)

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50

CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100

TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150

GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT
TAGTAATCAA TTACGGGGTC ATTAGTTCAT
CGTTACATAA CTTACGGTAA ATGGCCCGCC
CCCGCCCATT GACGTCAATA ATGACGTATG
GGGACTTTCC ATTGACGTCA ATGGGTGGAG
CTTGGCAGTA CATCAAGTGT ATCATATGCC 
TCAATGACGG TAAATGGCCC GCCTGGCATT 
TGGGACTTTC CTACTTGGCA GTACATCTAC 
CATGGTGATG CGGTTTTGGC AGTACATCAA 

10 ACTCACGGGG ATTTCCAAGT CTCCACCCCA 
TTTTGGCACC AAAATCAACG GGACTTTCCA 
CCCATTGACG CAAATGGGCG GTAGGCGTGT 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG 
TGTTTTGACC TCCATAGAAG ACACCGGGAC 

15 GGAACGGTGC ATTGGAACGC GGATTCCCCG 
CCGCCTATAG ACTCTATAGG CACACCCCTT 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT 
CCCCTATTGG TGACGATACT TTCCATTACT 

20 GCCACAACTA TCTCTATTGG CTATATGCCA 
TGACACGGAC TCTGTATTTT TACAGGATGG 
AATTCACATA TACAACAACG CCGTCCCCCG 
CATAGCGTGG GATCTCCACG CGAATCTCGG 
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA 

25 CTCCAGCGGC TCATGGTCGC TCGGCAGCTC 
CCAGACTTAG GCACAGCACA ATGCCCACCA 
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT 
CACGGCTGAC GCAGATGGAA GACTTAAGGC 
GCAGCTGAGT TGTTGTATTC TGATAAGAGT 

30 GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC 
CGCGCGCGCC ACCAGACATA ATAGCTGACA 
CCATGGGTCT TTTCTGCAGT CACCGTCGGA 
TTACATGATT CTCTTTACCA ATTCTGCCCC 
CAACAGCTTA ACGTTGGCTT GCCACGCATT 

35 CTCTTACCGA ACTTGGCCGT AACCTGCCAA 
AACATCAAAC GAATCGACCG ATTGTTAGGT 
GCGACTCGCT GTATACCGTT GGCATGCTAG 
GATGCCCATT GTACTTGTTG ACTGGTCTGA 
TTATGGTATT GCGAGCTTCA GTCGCACTAC 

40 TATGAGAAAG CGTTCCCGCT TTCAGAGCAA 
CCAAT.TTCTA GCCGACCTTG CGAGCATTCT 
TCATTGTCAG TGATGCTGGC TTTAAAGTGC 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA 
AGACCTAGGA GCGGAAAACT GGAAACCTAT 

45 CATCTAGTCA CTCAAAGACT TTAGGCTATA 
CCAATCTCAT GCCAAATTCT ATTGTATAAA 
AAATCAGCGC TCGACACGGA CTCATTGTCA 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC 
GAAATTCGAA CACCCAAACA ACTTGTTAAT 

50 GATTGAAGAA ACCTTCCGAG ACTTGAAAAG 
TACGCCATA6 CCX3AACGAGC AGCTCAGAGC 
ATCGCCCTGA TGCTTCAACT AACATGTTGG 
GAAACAAGGT TGGGACAAGC ACTTCCAGGC 
ACGTACTCTC AACAGTTCGC TTAGGCATGG 

55 TACACAATAA CAAGGGAAGA CTTACTCGTG 
AAATTTATTC ACACATGGTT ACGCTTTGGG 
GATCACTTCT GGCTAATAAA AGATCAGAGC 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT 
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA 

60 TTCCTAATAA AATGAGGAAA TTGCATCGCA 
CTATTCTGGG GGGTGGGGT6 GGGCAGCACA 

ATUIB02 



PCT/US2003/020389 



GATTATTGAt. 


i AVa ATTAA 


£ 3v 


AGCCCATATA 


1 GGAGTTCCG 


300 


TGGCTGACCG 


CCCAACGACC 


350 


TTCCCATAGT 


A A ^/^^^ A A W A 

AACGCCAATA 


4 00 


Ik ^^^^^^^^^ R ^^^^^^^^^^ 

TATTTACGUT 


AAA ^n»/'i r^r*fy A 

AAACTGCCCA 


>l C A 

450 




pp*i» A n^pp A 


C A A 




p a T/^ a ppTT a 




i A 1 I A^ 1 1. A 


Tpp PT a TT a p 
1 1 A I 1 AC 






T a P PPPTT'TP 


CCA 


1 1 uA\.u 1 L.AA 


TPPP A PTTTP 




AAAlol^uIA 


apa ac^pppp 
ALAALX UV.^L 


/DO 


ALpUO 1 ^sV^oAU 


PTPfaTa'pa a 
IjILXAXAXAA 


D A A 
bOO 


1 vjIjAQjAL-U 


LLAi L-LALvjL 


OCA 

bbO 


L OA i C L A(jL C 




9 00 


i ouLAACjAvj i 


LjALU i AACj 1 A 




itjodCi 1 Ai 


OLAHjL iAl a 


1 r» A r\ 
1000 




ppT'aT'app'T'p 
o V. X A 1 AuUi X V7 


1 c n 
X Ub U 


* rrirpp A 1^/^ A TT 
/\1 iUAL-^Al 1 


a TT'P a PPA PT 
A i X vaAL V-AL. X 


T 1 A A 
XIUO 


A aTPPaTZi ar* 


A 1 OOL. i ^_ X X i 


X X u 


A i AL. 1 J. o i 


PTTP a p ap a p 
L 1 X LAijA^jAL. 


1 O A A 


p p T r» rr* a T'TT' 


a TT* a T "PT' a p a 
Al XAi i. XAL.A 


1 O C A 

X Z o u 


1 ui. AV3 i, 


nrT"T"r' a TT" a a a 
X X X X AX X AAA 


1 "J AA 
1 J UU 


P T A PPTP T"r P 


PPP AP AT*/^^/^ 


1 C A 

1 J bO 


*PPPP A ^ PO^T* 


PP'PP^^ A ^ 

GGTCCCATVjC 


1 j4 A A 

1400 


CTTGCTCCTA 


ACAGTGGAGG 


1450 


CCACCAGTGT 


GCCGCACAAG 


1500 


GAGCGTGGAG 


ATTGGGCTCG 


1550 


AGCGGCAGAA 


GAAGATGCAG 


1600 


CAGAGGTAAC 


TCCCGTTGCG 


1650 


TGAGCAGTAC 


TCGTTGCTGC 


1700 


GACTAACAGA 


CTGTTCCTTT 


1750 


CCATGTGTGA 


ACTTGATATT 


1 a A A 

1800 


GAATTACACT 


fHTV TV TV TV ^/^TV ^^fl^ 

TAAAACGACT 


1850 


ACT^l bALT(»T 


A A A A ^'P/*)1*^ A 

AAAACTCTCA 


1 B A A 

1900 


L. AAA(9 CasAvs 


A A/^A A A A/^AK 

AACAAAALAT 


1 o e A 
1950 


AATCC^TLALC 


T*/^/^ A A A A^ A 

TC CACAAAGA 


O A A A 

2000 


Ci 1 lAiUXui. 


T CGGG AATAC 


1 A C A 

2050 


T ATT CGTGAG 


^ A A A A A f^C* A 

CAAAAACGAC 


IT Art 

2100 


ACGGTCGTTC 


TGTTACTCTT 


2150 


rp/^cnrn/^ AAA A 

TGTTCAAAGA 


A A f^f^T^fy A T>^ A 

AAG CT C ATGA 


1 1 A A 

2200 


A f^f^r^ A /"^T" A A ^ 

At-CijACj i AAL 


ALLACAH-(jC. 


1 1 C A 

22 50 


pR'ppPT'aT'a a 
LA i \3\3 1 A i AA 


aTPPPTHTP a r* 


1 1 A A 

23 00 


ap app a a a ap 
AOAkivjAAAAVa 


T* a p a a T a tp p 
XAv_AAX AXIjL 


O "J C A 

2 J b U 


p a pp a a ptt^ a 
L.AuV«AAC 1 1 A 


p a TP a T a tp t* 
LAXuAX AXuX 


O A A A 
Z 4 U U 


PL\3i\\3\J\^ 1 U AU 


TaaaappaaT 
X AAAAULAA X 


T 4 C A 
Z4b U 


T PTPP P"rPT a 


a apnpppa a a 

AAuVjC L kjAAA 


C A A 
2bUU 


p p a pp/^rsTP a 


ppTa a a aTPT 
LL X AAAAX L X 


*5 c c rt 
2bb U 




/ "i"i'a PPTPT'T' 
LX XALLXVjX X 


oc Art 
2o UU 


a T/^l* A T^^^^^^ A 




OC C A 
2030 


TCwTGCGTAC 


GGACTAGGCC 


T A A 

2700 


GTTTTGATAT 


CATGCTGCTA 


2750 


CTTGCGGGCG 


TTCATGCTCA 


2800 


TAACACAGTC 


Tl ^ TV ^^^^ % 1L 

AGAAATCGAA 


A P A 

2850 


AAGTTTTGCG 


GCATTCTGGC 


2900 


GCTGCAACCC 


TACTAGCTCA 


2950 


GAAATTATGA 


TAATGATCCA 


3000 


TCTAGAGATC 


TGTGTGTTGG 


3050 


GCCAGCCATC 


TGTTGTTTGC 


3100 


GGTGCCACTC 


CCACTGTCCT 


3150 


TTGTCTGAGT 


AGGTGTCATT 


3200 


GCAAGGGGGA 


GGATTGGGAA 


3250 



GACAATAGCA- GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3 300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400 
TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550 
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 
GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800 
TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGG 4050
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100 
AACAATAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG 4150 
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4200
ATCTGCCAGG CCATTAAGTT ATTCATGGAA GATCTTTGAG GAACACTGCA 4250
AGTTCATATC ATAAACACAT TTGAAATTGA GTATTGTTTT GCATTGTATG 4300
GAGCTATGTT TTGCTGTATC CTCAGAAAAA AAGTTTGTTA TAAAGCATTC 4350
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAGG AAAGAAAGTG 4400
CGTCTGCTCT TCACTCTAGT CTCAGTTGGC TCCTTCACAT GCATGCTTCT 4450
TTATTTCTCC TATTTTGTCA AGAAAATAAT AGGTCACGTC TTGTTCTCAC 4500 
TTATGTCCTG CCTAGCATGG CTCAGATGCA CGTTGTAGAT ACAAGAAGGA 4550
TCAAATGAAA CAGACTTCTG GTCTGTTACT ACAACCATAG TAATAAGCAC 4600 
ACTAACTAAT AATTGCTAAT TATGTTTTCC ATCTCTAAGG TTCCCACATT 4650 
TTTCTGTTTT CTTAAAGATC CCATTATCTG GTTGTAACTG AAGCTCAATG 4700 
GAACATGAGC AATATTTCCC AGTCTTCTCT CCCATCCAAC AGTCCTGATG 4750 
GATTAGCAGA ACAGGCAGAA AACACATTGT TACCCAGAAT TAAAAACTAA 4800 
TATTTGCTCT CCATTCAATC CAAAATGGAC CTATTGAAAC TAAAATCTAA 4850 
CCCAATCCCA TTAAATGATT TCTATGGCGT CAAAGGTCAA ACTTCTGAAG 4900 
GGAACCTGTG GGTGGGTCAC AATTCAGGCT ATATATTCCC CAGGGCTCAG 4950 
CGGATCCATG GGCTCCATCG GCGCAGCAAG CATGGAATTT TGTTTTGATG 5000 
TATTCAAGGA GCTCAAAGTC CACCATGCCA ATGAGAACAT CTTCTACTGC 5050 
CCCATTGCCA TCATGTCAGC TCTAGCCATG GTATACCTGG GTGCAAAAGA 5100
CAGCACCAGG ACACAGATAA ATAAGGTTGT TCGCTTTGAT AAACTTCCAG 5150 
GATTCGGAGA CAGTATTGAA GCTCAGTGTG GCACATCTGT AAACGTTCAC 5200 
TCTTCACTTA GAGACATCCT CAACCAAATC ACCAAACCAA ATGATGTTTA 5250 
TTCGTTCAGC CTTGCCAGTA GACTTTATGC TGAAGAGAGA TACCCAATCC 5300
TGCCAGAATA CTTGCAGTGT GTGAAGGAAC TGTATAGAGG AGGCTTGGAA 5350
CCTATCAACT TTCAAACAGC TGCAGATCAA GCCAGAGAGC TCATCAATTC 5400
CTGGGTAGAA AGTCAGACAA ATGGAATTAT CAGAAATGTC CTTCAGCCAA 5450
GCTCCGTGGA TTCTCAAACT GCAATGGTTC TGGTTAATGC CATTGTCTTC 5500 
AAAGGACTGT GGGAGAAAAC ATTTAAGGAT GAAGACACAC AAGCAATGCC 5550 
TTTCAGAGTG ACTGAGCAAG AAAGCAAACC TGTGCAGATG ATGTACCAGA 5600 
TTGGTTTATT TAGAGTGGCA TCAATGGCTT CTGAGAAAAT GAAGATCCTG 5650 
GAGCTTCCAT TTGCCAGTGG GACAATGAGC ATGTTGGTGC TGTTGCCTGA 5700 
TGAAGTCTCA GGCCTTGAGC AGCTTGAGAG TATAATCAAC TTTGAAAAAC 5750 
TGACTGAATG GACCAGTTCT AATGTTATGG AAGAGAGGAA GATCAAAGTG 5800 
TACTTACCTC GCATGAAGAT GGAGGAAAAA TACAACCTCA CATCTGTCTT 5850 
AATGGCTATG GGCATTACTG ACGTGTTTAG CTCTTCAGCC AATCTGTCTG 5900 
GCATCTCCTC AGCAGAGAGC CTGAAGATAT CTCAAGCTGT CCATGCAGCA 5950 
CATGCAGAAA TCAATGAAGC AGGCAGAGAG GTGGTAGGGT CAGCAGAGGC 6000 
TGGAGTGGAT GCTGCAAGCG TCTCTGAAGA ATTTAGGGCT GACCATCCAT 6050 
TCCTCTTCTG TATCAAGCAC ATCGCAACCA ACGCCGTTCT CTTCTTTGGC 6100 
AGATGTGTTT CCCCTCCGCG GCCAGCAGAT GACGCACCAG CAGATGACGC 6150 
ACCAGCAGAT GACGCACCAG CAGATGACGC ACCAGCAGAT GACGCACCAG 6200 
CAGATGACGC AACAACATGT ATCCTGAAAG GCTCTTGTGG CTGGATCGGC 6250 
CTGCTGGATG ACGATGACAA ATTTGTGAAC CAACACCTGT GCGGCTCACA 6300
CCTGGTGGAA GCTCTCTACC TAGTGTGCGG GGAACGAGGC TTCTTCTACA 6350
CACCCAAGAC CCGCCGGGAG GCAGAGGACC TGCAGGTGGG GCAGGTGGAG 6400
CTGGGCGGGG GCCCTGGTGC AGGCAGCCTG CAGCCCTTGG CCCTGGAGGG 6450
GTCCCTGCAG AAGCGTGGCA TTGTGGAACA ATGCTGTACC AGCATCTGCT 6500
CCCTCTACCA GCTGGAGAAC TACTGCAACT AGGGCGCCTG GATCCAGATC 6550
ACTTCTGGCT AATAAAAGAT CAGAGCTCTA GAGATCTGTG TGTTGGTTTT 6600
TTGTGGATCT GCTGTGCCTT CTAGTTGCCA GCCATCTGTT GTTTGCCCCT 6650 
CCCCCGTGCC TTCCTTGACC CTGGAAGGTG CCACTCCCAC TGTCCTTTCC 6700 
TAATAAAATG AGGAAATTGC ATCGCATTGT CTGAGTAGGT GTCATTCTAT 6750 

TCTGGGGGGT GGGGTGGGGC AGCACAGCAA GGGGGAGGAT TGGGAAGACA 6800
ATAGCAGGCA TGCTGGGGAT GCGGTGGGCT CTATGGGTAC CTCTCTCTCT 6850
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCTCT CTCGAGGGGG 6900 
GGCCCGGTAC CCAATTCGCC CTATAGTGAG TCGTATTACG CGCGCTCACT 6950 
GGCCGTCGTT TTACAACGTC GTGACTGGGA AAACCCTGGC GTTACCCAAC 7000 

TTAATCGCCT TGCAGCACAT CCCCCTTTCG CCAGCTGGCG TAATAGCGAA 7050
GAGGCCCGCA CCGATCGCCC TTCCCAACAG TTGCGCAGCC TGAATGGCGA 7100 
ATGGAAATTG TAAGCGTTAA TATTTTGTTA AAATTCGCGT TAAATTTTTG 7150 
TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT 7200 
ATAAATCAAA AGAATAGACC GAGATAGGGT TGAGTGTTGT TCCAGTTTGG 7250 

AACAAGAGTC CACTATTAAA GAACGTGGAC TCCAACGTCA AAGGGCGAAA 7300
AACCGTCTAT CAGGGCGATG GCCCACTACT CCGGGATCAT ATGACAAGAT 7350
GTGTATCCAC CTTAACTTAA TGATTTTTAC CAAAATCATT AGGGGATTCA 7400
TCAGTGCTCA GGGTCAACGA GAATTAACAT TCCGTCAGGA AAGCTTATGA 7450 
TGATGATGTG CTTAAAAACT TACTCAATGG CTGGTTATGC ATATCGCAAT 7500 

ACATGCGAAA AACCTAAAAG AGCTTGCCGA TAAAAAAGGC CAATTTATTG 7550
CTATTTACCG CGGCTTTTTA TTGAGCTTGA AAGATAAATA AAATAGATAG 7600 
GTTTTATTTG AAGCTAAATC TTCTTTATCG TAAAAAATGC CCTCTTGGGT 7650 
TATCAAGAGG GTCATTATAT TTCGCGGAAT AACATCATTT GGTGACGAAA 7700
TAACTAAGCA CTTGTCTCCT GTTTACTCCC CTGAGCTTGA GGGGTTAACA 7750 

TGAAGGTCAT CGATAGCAGG ATAATAATAC AGTAAAACGC TAAACCAATA 7800
ATCCAAATCC AGCCATCCCA AATTGGTAGT GAATGATTAT AAATAACAGC 7850 
AAACAGTAAT GGGCCAATAA CACCGGTTGC ATTGGTAAGG CTCACCAATA 7900 
ATCCCTGTAA AGCACCTTGC TGATGACTCT TTGTTTGGAT AGACATCACT 7950 
CCCTGTAATG CAGGTAAAGC GATCCCACCA CCAGCCAATA AAATTAAAAC 8000 

AGGGAAAACT AACCAACCTT CAGATATAAA CGCTAAAAAG GCAAATGCAC 8050
TACTATCTGC AATAAATCCG AGCAGTACTG CCGTTTTTTC GCCCCATTTA 8100
GTGGCTATTC TTCCTGCCAC AAAGGCTTGG AATACTGAGT GTAAAAGACC 8150
AAGACCCGCT AATGAAAAGC CAACCATCAT GCTATTCCAT CCAAAACGAT 8200
TTTCGGTAAA TAGCACCCAC ACCGTTGCGG GAATTTGGCC TATCAATTGC 8250 

GCTGAAAAAT AAATAATCAA CAAAATGGCA TCGTTTTAAA TAAAGTGATG 8300
TATACCGAAT TCAGCTTTTG TTCCCTTTAG TGAGGGTTAA TTGCGCGCTT 8350
GGCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA 8400
CAATTCCACA CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT 8450 
GCCTAATGAG TGAGCTAACT CACATTAATT GCGTTGCGCT CACTGCCCGC 8500 

TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA ATCGGCCAAC 8550
GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC 8600 
ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA 8650 
CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA 8700 
AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC 8750

GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA 8800
AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT 8850 
ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC 8900 
CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC 8950
GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC 9000 

GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC 9050
GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT 9100 
ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG 9150 
TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT 9200 
AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG 9250 

AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG 9300
GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT 9350 




WO 2004/003157

PCT/US2003/020389



CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA 9400
AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA 9450
CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA 9500
TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC 9550
TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG 9600
TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 9650
GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT 9700
AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT 9750
CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT 9800
TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT 9850
GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC 9900
GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC 9950
TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG CAGTGTTATC 10000
ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCATCCG 10050
TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA 10100
TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 10150
TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT 10200
CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG 10250
ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT TTACTTTCAC 10300
CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGG 10350
GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA 10400
TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT 10450
TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC 10500
GAAAAGTGCC AC 10512



SEQ ID NO:32 (pTnMod (Oval/ENT tag/Proins/PA) - QUAIL) 



CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50
CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 450
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 500
TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 550
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 850
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900
GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 950
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150
GCCACAACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200
TGACACGGAC TCTGTATTTT TACAGGATGG GGTCCCATTT ATTATTTACA 1250
AATTCACATA TACAACAACG CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300
CATAGCGTGG GATCTCCACG CGAATCTCGG GTACGTGTTC CGGACATGGG 1350
CTCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTGGAGG 1450
CCAGACTTAG GCACAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550
CACGGCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600
GCAGCTGAGT TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650
GTGCTGTTAA CGGTGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTTGCTGC 1700
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 
CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 
GCGACTCGCT GTATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 

TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150
TATGAGAAAG CGTTCCCGCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250
TCATTGTCAG TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350 

AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 
AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 

GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700
TACGCCATAG CCGAACGAGC AGCTCAGAGC GTTTTGATAT CATGCTGCTA 2750 
ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800 
GAAACAAGGT TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2850 

ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2900
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 

CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150
TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200 
CTATTCTGGG GGGTGGGGTG GGGCAGCACA GCAAGGGGGA GGATTGGGAA 3250 
GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 

CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400
TTGACCCGGT GACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 

CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 
GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800 
TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3850 

CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000 
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGG 4050 
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100 

AACAAAAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG 4150
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4200
ATCTGCCAGG CTGGAAGATC ATGGAAGATC TCTGAGGAAC ATTGCAAGTT 4250
CATACCATAA ACTCATTTGG AATTGAGTAT TATTTTGCTT TGAATGGAGC 4300
TATGTTTTGC AGTTCCCTCA GAAGAAAAGC TTGTTATAAA GCGTCTACAC 4350

55 CCATC^AAAG ATATATTTAA ATATTCCAAC TACAGAAAGA TTTTGTCTGC 44 00 
TCTTCACTCT GATCTCAGTT GGTTTCTTCA CGTACATGCT TCTTTATTTG 4450 
CCTATTTTGT CAAGAAAATA ATAGGTCAAG TCCTGTTCTC ACTTATCTCC 4500
TGCCTAGCAT GGCTTAGATG CACGTTGTAC ATTCAAGAAG GATCAAATGA 4550 
AACAGACTTC TGGTCTGTTA CAACAACCAT AGTAATAAAC AGACTAACTA 4600 

ATAATTGCTA ATTATGTTTT CCATCTCTAA GGTTCCCACA TTTTTCTGTT 4650
TTAAGATCCC ATTATCTGGT TGTAACTGAA GCTCAATGGA ACATGAACAG 4700
TATTTCTCAG TCTTTTCTCC AGCAATCCTG ACGGATTAGA AGAACTGGCA 4750
GAAAACACTT TGTTACCCAG AATTAAAAAC TAATATTTGC TCTCCCTTCA 4800
ATCCAAAATG GACCTATTGA AACTAAAATC TGACCCAATC CCATTAAATT 4850
ATTTCTATGG CGTCAAAGGT CAAACTTTTG AAGGGAACCT GTGGGTGGGT 4900
CCCAATTCAG GCTATATATT CCCCAGGGCT GAGCCAGTGG ATCCATGGGC 4950
TCCATCGGTG CAGCAAGCAT GGAATTTTGT TTTGATGTAT TCAAGGAGCT 5000 
CAAAGTCCAC CATGCCAATG ACAACATGCT CTACTCCCCC TTTGCCATCT 5050
TGTCAACTCT GGCCATGGTC TTCCTAGGTG CAAAAGACAG CACCAGGACC 5100
CAGATAAATA AGGTTGTTCA CTTTGATAAA CTTCCAGGAT TCGGAGACAG 5150 

TATTGAAGCT CAGTGTGGCA CATCTGTAAA TGTTCACTCT TCACTTAGAG 5200
ACATACTCAA CCAAATCACC AAACAAAATG ATGCTTATTC GTTCAGCCTT 5250 
GCCAGTAGAC TTTATGCTCA AGAGACATAC ACAGTCGTGC CGGAATACTT 5300 
GCAATGTGTG AAGGAACTGT ATAGAGGAGG CTTAGAATCC GTCAACTTTC 5350 
AAACAGCTGC AGATCAAGCC AGAGGCCTCA TCAATGCCTG GGTAGAAAGT 54 00 

CAGACAAACG GAATTATCAG AAACATCCTT CAGCCAAGCT CCGTGGATTC 5450
TCAAACTGCA ATGGTCCTGG TTAATGCCAT TGCCTTCAAG GGACTGTGGG 5500 
AGAAAGCATT TAAGGCTGAA GACACGCAAA CAATACCTTT CAGAGTGACT 5550 
GAGCAAGAAA GCAAACCTGT GCAGATGATG TACCAGATTG GTTCATTTAA 5600 
AGTGGCATCA ATGGCTTCTG AGAAAATGAA GATCCTGGAG CTTCCATTTG 5650 

CCAGTGGAAC AATGAGCATG TTGGTGCTGT TGCCTGATGA TGTCTCAGGC 5700
CTTGAGCAGC TTGAGAGTAT AATCAGCTTT GAAAAACTGA CTGAATGGAC 5750 
CAGTTCTAGT ATTATGGAAG AGAGGAAGGT CAAAGTGTAC TTACCTCGCA 5800 
TGAAGATGGA GGAGAAATAC AACCTCACAT CTCTCTTAAT GGCTATGGGA 5850 
ATTACTGACC TGTTCAGCTC TTCAGCCAAT CTGTCTGGCA TCTCCTCAGT 5900 

AGGGAGCCTG AAGATATCTC AAGCTGTCCA TGCAGCACAT GCAGAAATCA 5950
ATGAAGCGGG CAGAGATGTG GTAGGCTCAG CAGAGGCTGG AGTGGATGCT 6000 
ACTGAAGAAT TTAGGGCTGA CCATCCATTC CTCTTCTGTG TCAAGCACAT 6050 
CGAAACCAAC GCCATTCTCC TCTTTGGCAG ATGTGTTTCT CCGCGGCCAG 6100 
CAGATGACGC ACCAGCAGAT GACGCACCAG CAGATGACGC ACCAGCAGAT 6150 

GACGCACCAG CAGATGACGC ACCAGCAGAT GACGCAACAA CATGTATCCT 6200
GAAAGGCTCT TGTGGCTGGA TCGGCCTGCT GGATGACGAT GACAAATTTG 6250 
TGAACCAACA CCTGTGCGGC TCACACCTGG TGGAAGCTCT CTACCTAGTG 6300 
TGCGGGGAAC GAGGCTTCTT CTACACACCC AAGACCCGCC GGGAGGCAGA 6350 
GGACCTGCAG GTGGGGCAGG TGGAGCTGGG CGGGGGCCCT GGTGCAGGCA 6400

GCCTGCAGCC CTTGGCCCTG GAGGGGTCCC TGCAGAAGCG TGGCATTGTG 6450
GAACAATGCT GTACCAGCAT CTGCTCCCTC TACCAGCTGG AGAACTACTG 6500
CAACTAGGGC GCCTGGATCC AGATCACTTC TGGCTAATAA AAGATCAGAG 6550
CTCTAGAGAT CTGTGTGTTG GTTTTTTGTG GATCTGCTGT GCCTTCTAGT 6600 
TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA 6650 

AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC 6700
ATTGTCTGAG TAGGTGTCAT TCTATTCTGG GGGGTGGGGT GGGGCAGCAC 6750 
AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG GGGATGCGGT 6800 
GGGCTCTATG GGTACCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC 6850 
TCTCTCGGTA CCTCTCTCGA GGGGGGGCCC GGTACCCAAT TCGCCCTATA 6900 

GTGAGTCGTA TTACGCGCGC TCACTGGCCG TCGTTTTACA ACGTCGTGAC 6950
TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC 7000 
TTTCGCCAGC TGGCGTAATA GCGAAGAGGC CCGCACCGAT CGCCCTTCCC 7050 
AACAGTTGCG CAGCCTGAAT GGCGAATGGA AATTGTAAGC GTTAATATTT 7100 
TGTTAAAATT CGCGTTAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA 7150 

TAGGCCGAAA TCGGCAAAAT CCCTTATAAA TCAAAAGAAT AGACCGAGAT 7200
AGGGTTGAGT GTTGTTCCAG TTTGGAACAA GAGTCCACTA TTAAAGAACG 7250 
TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG CGATGGCCCA 7300 
CTACTCCGGG ATCATATGAC AAGATGTGTA TCCACCTTAA CTTAATGATT 7350 
TTTACCAAAA TCATTAGGGG ATTCATCAGT GCTCAGGGTC AACGAGAATT 7400 

AACATTCCGT CAGGAAAGCT TATGATGATG ATGTGCTTAA AAACTTACTC 7450
AATGGCTGGT TATGCATATC GCAATACATG CGAAAAACCT AAAAGAGCTT 7500
GCCGATAAAA AAGGCCAATT TATTGCTATT TACCGCGGCT TTTTATTGAG 7550
CTTGAAAGAT AAATAAAATA GATAGGTTTT ATTTGAAGCT AAATCTTCTT 7600 
TATCGTAAAA AATGCCCTCT TGGGTTATCA AGAGGGTCAT TATATTTCGC 7650 

GGAATAACAT CATTTGGTGA CGAAATAACT AAGCACTTGT CTCCTGTTTA 7700
CTCCCCTGAG CTTGAGGGGT TAACATGAAG GTCATCGATA GCAGGATAAT 7750
AATACAGTAA AACGCTAAAC CAATAATCCA AATCCAGCCA TCCCAAATTG 7800
GTAGTGAATG ATTATAAATA ACAGCAAACA GTAATGGGCC AATAACACCG 7850
GTTGCATTGG TAAGGCTCAC CAATAATCCC TGTAAAGCAC CTTGCTGATG 7900
ACTCTTTGTT TGGATAGACA TCACTCCCTG TAATGCAGGT AAAGCGATCC 7950
CACCACCAGC CAATAAAATT AAAACAGGGA AAACTAACCA ACCTTCAGAT 8000
ATAAACGCTA AAAAGGCAAA TGCACTACTA TCTGCAATAA ATCCGAGCAG 8050 
TACTGCCGTT TTTTCGCCCC ATTTAGTGGC TATTCTTCCT GCCACAAAGG 8100 
CTTGGAATAC TGAGTGTAAA AGACCAAGAC CCGCTAATGA AAAGCCAACC 8150 
ATCATGCTAT TCCATCCAAA ACGATTTTCG GTAAATAGCA CCCACACCGT 8200 

TGCGGGAATT TGGCCTATCA ATTGCGCTGA AAAATAAATA ATCAACAAAA 8250
TGGCATCGTT TTAAATAAAG TGATGTATAC CGAATTCAGC TTTTGTTCCC 8300
TTTAGTGAGG GTTAATTGCG CGCTTGGCGT AATCATGGTC ATAGCTGTTT 8350
CCTGTGTGAA ATTGTTATCC GCTCACAATT CCACACAACA TACGAGCCGG 8400 
AAGCATAAAG TGTAAAGCCT GGGGTGCCTA ATGAGTGAGC TAACTCACAT 8450 

TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA CCTGTCGTGC 8500
CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 8550 
TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC 8600 
GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC 8650
ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA 8700

AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC 8750
TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG 8800
CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC 8850
CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG 8900
CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG 8950

TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA 9000
ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG 9050
AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT 9100 
AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG AGTTCTTGAA 9150 
GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG 9200 

CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC 9250
GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA 9300 
GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA 9350 
CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC 9400 
ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA ATTAAAAATG 9450

AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 9500
ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT 9550
TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA 9600
GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT 9650 
CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG AAGGGCCGAG 9700 

CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT CTATTAATTG 9750
TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 9800 
TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG 9850 
GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC 9900
CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA 9950 

GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC AGCACTGCAT 10000
AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG TGACTGGTGA 10050 
GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT 10100 
CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA 10150 
AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT 10200 

CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACCCAACT 10250
GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA 10300 
GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC GGAAATGTTG 10350 
AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 10400 
ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA 10450

ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCAC 10487



SEQ ID NO: 33 (conalbumin polyA) 

tctgccattg ctgcttcctc tgcccttcct cgtcactctg aatgtggctt cttcgctact

gccacagcaa gaaataaaat ctcaacatcc aaatgggttt cctgaggttt ttcaagagtc

gtCaagcaca ttccttcccc agcacccctt gctgcaggcc agtgccaggc accaacttgg
ctactgctgc ccatgagaga aatccagttc aatattttcc aaagcaaaat ggattacata
tgccctagat cctgattaac aggcgtttgt attatctagt gctttcgctt cacccagatt 
atcccdttgc ctccc 


SEQ ID NO:34 (exemplary antibody light chain sequence) 

1 gagctcgtga tgacccagac tccatccccc ctgtccgcct ctctgggaga cagagtcacc 
61 atcagttgca gggcaaacca ggacattagc aattatttaa actggtatca gcagaaacca 
121 gatggaactg ttaaactcct gatctactac acatcaagat tacactcagg ggtcccatca 

181 aggttcagtg gcagtgggtc tggaacagat tattctctca ccattagcaa cctggagcaa
241 caagattttg ccacttactt ttgccaacag ggtaatacgc ttccgtggac gttcggtgga 
301 ggcaccaacc tggaaatcaa acgggccgat gctgcaccaa ctgtatccat cttcccacca 
361 tccagtgagc agttaacatc tggaggtgcc tcagtcgtgt gcttcttgaa caacttctac 
421 cccaaagaca tcaatgccaa gtggaagatt gatggcagtg aacgacaaaa tggcgtcctg 

481 aacagttgga ctgatcagga cagcaaagac agcacctaca gcatgagcag caccctcacg
541 tcgaccaagg acgagtatga acgacataac agctatacct gtgaggccac tcacaagaca 
601 tcaacttcac ccattgtcaa gagcttcaac aggaatgagt gttaa 



SEQ ID NO: 35 (exemplary antibody heavy chain sequence)

1 ctcgagtcag gacctggcct ggtggcgccc tcacagaacc tgtccatcac ttgcactgtc 
61 tctgggtttt cattaaccag ctatggtgta cactgggttc gccagcctcc aggaaagggt 
121 ctggaatggc tgggagtaat atggactggt agaagcacaa cttataattc ggctctcatg 
181 tccagactga gcatcagcaa agacaactcc aagagccaag ttttcttaaa aatgaacagt

241 ctgcaaactg atgacacagc catttactac tgtggcagag ggggtctgat tacgtccttt
301 gctatggact actggggtca aggaacctca gccaccgtct cctcagccaa aacgacaccc 
361 ccatctgtct atccactggc ccctggatct gctgcccaaa ctaactccat ggtgaccctg 
421 ggatgcctgg tcaagggcta tttccctgag ccagtgacag tgacctggaa ctctggatcc 
481 ctgtccagcg gtgtgcacac cttcccagct gtcctgcagt ctgacctcta cactctgagc

541 agctcagtga ctgtcccctc cagcacctgg cccagcgaga ccgtcacctg caacgttgcc
601 cacccggcca gcagcaccaa ggtggacaag aaaattgtgc ccagggattg tactagt 



SEQ ID NO: 36 (pTnMCS) 

1 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga
61 ccgccacact tgccagcgcc ctagcgcccg ctcctttcgc ttccttccct tcctttctcg
121 ccacgttcgc cggcatcaga ttggctattg gccactgcac acgttgtatc catatcataa 

181 tctgtacatt tatattggct catgtccaac attaccgcca tgttgacatt gattattgac
241 tagttattaa tagtaaccaa tcacggggtc attagttcat agcccatata tggagttccg 

301 cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatc
361 gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 
421 atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 
4 81 aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 
541 catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 

601 catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggttcg actcacgggg
661 atttccaagt ctccacccca ttgacgtcaa tgggagtctg ttttggcacc aaaatcaacg 
721 ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 
781 acggcgggag gcctatataa gcagagctcg tttagcgaac cgtcagatcg cctggagacg
841 ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg
901 ggaacggtgc attggaacgc ggattccccg tgccaagagt gacgtaagta ccgcctatag
961 actctatagg cacacccctt tggctcttat gcatgctata ctgcccttgg cttggggcct
1021 atacaccccc gcttccttat gctataggtg atggtatagc ttagcctata ggtgtgggtt
1081 attgaccact attgaccact cccctattgg tgacgatact ttccattact aaCccataac
1141 atggctcttt gccacaacta tctctattgg ctatatgcca atactctgtc cttcagagac 
1201 tgacacggac tctgtatttc tacaggatgg ggtcccattt attatttaca aattcacata 
1261 tacaacaacg ccgtcccccg tgcccgcagt ttttattaaa catagcgcgg gacccccacg
1321 cgaatcccgg gtacgtgttc cggacacggg ctcctctccg gtagcggcgg agcttccaca

1381 tccgagccct ggtcccatgc ccccagcggc tcatggtcgc ccggcagctc cttgctccta
1441 acagtggagg ccagacttag gcacagcaca atgcccacca ccaccagtgt gccgcacaag
1501 gccgtggcgg tagggtatgt gtctgaaaat gagcgtggag attgggctcg cacggctgac
1561 gcagatggaa gacttaaggc agcggcagaa gaagatgcag gcagctgagt tgttgtattc 
1621 tgataagagc cagaggtaac tcccgttgcg gtgctgttaa cggtggaggg cagtgtagtc 

1681 tgagcagcac tcgttgctgc cgcgcgcgcc accagacata atagctgaca gactaacaga
1741 ctgttccttt ccatgggtct tccctgcagt caccgtcgga ccatgtgcga actcgatatt 
1801 ctacacgact ctctctacca atcctgcccc gaattacact caaaacgact caacagctta 
1861 acgttggctt gccacgcatt acttgactgt aaaactctca cccttaccga acttggccgt 
1921 aacctgccaa ccaaagcgag aacaaaacat aacatcaaac gaatcgaccg attgttaggt 

1981 aatcgtcacc tccacaaaga gcgactcgct gcataccgtt ggcatgctag cCCCatctgt
2041 tcgggcaata cgatgcccat cgtacttgtc gactggtctg atattcgtga gcaaaaacga
2101 cttacggtat cgcgagcttc agccgcacta cacggtcgtt ccgctactct tcatgagaaa
2161 gcgtccccgc tttcagagca atgtccaaag aaagctcatg accaacccct agccgacctt 
2221 gcgagcattc taccgagtaa caccacaccg cccactgcca gcgatgctgg ctttaaagtg 

2281 ccatggcaca aatccgttga gaagctgggt tggtactggt taagtcgagt aagaggaaaa
2341 gtacaatatg cagacctagg agcggaaaac tggaaaccta tcagcaactt acatgatatg
2401 tcatctagtc actcaaagac cccaggctat aagaggctga ctaaaagcaa tccaatctca
2461 Cgccaaattc tattgtataa atctcgctct aaaggccgaa aaaatcagcg ctcgacacgg
2521 actcattgtc accacccgtc acctaaaatc tactcagcgt cggcaaagga gccatgggct

2581 ctagcaacta acttacctgt tgaaattcga acacccaaac aacttgttaa tatctattcg
2641 aagcgaatgc agattgaaga aaccttccga gacttgaaaa gtcctgccta cggactaggc
2701 ctacgccata gccgaacgag cagctcagag cgttttgata tcatgctgct aatcgccctg 
2761 c\tgcttcaac taacatgttg gcttgcgggc gttcatgctc agaaacaagg ttgggacaag 
2821 cacttccagg ctaacacagt cagaaatcga aacgtactct caacagttcg ctcaggcatg 

2881 gaagtttcgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc
2941 ctactagctc aaaatttatt cacacatggt tacgctttgg ggaaattatg aggggaccgc
3001 tctagagcga tccgggatct cgggaaaagc gttggtgacc aaaggtgcct ttcatcatca
3061 ctttaaaaat aaaaaacaac tactcagtgc ccgttataag cagcaattaa ttatgattga 
3121 tgcctacatc acaacaaaaa ctgatttaac aaatggttgg tctgccttag aaagtatatt 

3181 tgaacattat cttgattata ttattgataa Caataaaaac cCCaCcccta tccaagaagt
3241 G|aCgcctacc attggttgga acgaacttga aaaaaattag ccttgaatac attactggta 
3301 aggtaaacgc cattgtcagc aaattgatcc aagagaacca acttaaagct ttcctgacgg 
3361 aatgttaacc ctcgttgacc ccgagcactg atgaatcccc taatgatttt ggtaaaaatc 
3421 attaagttaa ggtggaCaca catcttgtca tatgatcccg gtaatgtgag ttagctcact 

3481 cattaggcac cccaggcttc acactttatg cttccggctc gtatgttgtg tggaattgtg
3541 agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa gcgcgcaatt
3601 aaccctcact aaagggaaca aaagctggag ctccaccgcg gtggcggccg ctctagaact
3661 p.gtggatccc ccgggccgca ggaattcgat atcaagctta tcgataccgc tgacctcgag
3721 cjgggggcccg gtacccaatt cgccctatag Cgagtcgtat tacgcgcgct cactggccgt
3781 cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc caacttaacc gccttgcagc
3841 c-.catccccct ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca 
3901 c^.cagttgcgc agcctgaatg gcgaatggaa attgtaagcg ttaatatttt gttaaaattc 
3961 gcgttaaatt tttgtcaaat cagctcattt tttaaccaat aggccgaaat cggcaaaatc 
4021 ccttataaat caaaagaata gaccgagaca gggttgagtg ttgttccagt ttggaacaag 

4081 p.gtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc
4141 ciatggcccac taccccggga tcatatgaca agatgtgtaC ccaccttaac ttaatgattt 
4201 Ctaccaaaat cattagggga ttcatcagtg ctcagggtca acgagaatta acattccgtc 
4261 aggaaagctt atgatgatga tgcgcttaaa aacctaccca acggctggtt atgcatatcg 
4321 caatacatgc gaaaaaccta aaagagcttg ccgataaaaa aggccaattt attgctattt 

4381 accgcggctt tttattgagc ttgaaagata aataaaatag ataggtttta tttgaagcta
4441 aatcttcttt atcgtiaaaaa atgccctctt gggttaccaa gagggtcatt atatttcgcg 
4501 gaataacatc atttggtgac gaaataacta agcacttgtc tcctgtctac tcccctgagc 
4561 ttgaggggtt aacatgaagg tcatcgatag caggataata atacagtaaa acgctaaacc 
4621 aataatccaa atccagccat cccaaattgg cagtgaatga ttataaataa cagcaaacag 

4681 taatgggcca ataacaccgg ttgcattggt aaggctcacc aataatccct gtaaagcacc
4741 ttgctgatga ctctttgttt ggatagacat cactccctgt aatgcaggta aagcgatccc 
4801 accaccagcc aataaaatta aaacagggaa aactaaccaa ccttcagaca taaacgccaa 
4861 aaaggcaaat gcactactat ctgcaataaa cccgagcagC accgccgttt tttcgcccat
4921 r.tagtggcta ttcttccCgq gacaaaggct Cggaatactg agCgtaaaag accaagaccc
4981 gtaaCgaaaa gccaaccatc atgccatcca tcatcacgat Ctctgcaata gcaccacacc
5041 <jtgctggatt: ggctatcaac gcgccgaaat aacaatcaac aaatggcatic gCCaaataag
5101 ::gatgtatac cgaccagctc ttgttccctt tagtgagggt taattgcgcg cttggcgtaa
5161 ■-.(iatggccat agctgtttcc ^g^Stgaaat tgttatccgc tcacaattcc acacaacata
5221 cgagccggaa gcataaagtg caaagcccgg ggtgcctaac gagtgagcca actcacatca
5281 attgcgccgc gctcactgcc cgctctccag tc^ggaaacc tgtcgtgcca gctgcatcaa
5341 Lgaaccggcc aacgcgcggg gagaggcggc ttgcgcactg ggcgctcttc cgctccctcg
5401 ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag
5461 gcggtaatac ggttacccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa
5521 ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttc ccataggctc
5581 cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca
5641 ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg
5701 accctgccgc ttaccggata ccCgtccgcc tttctccctt cgggaagcgt ggcgctttct
5761 catagcccac gctgtaggta tcCcagttcg gtgtaggtcg ttcgctccaa gctgggccgt
5821 gcgcacgaac cccccgttca gcccgaccgc tgcgcctcat ccggtaacta tcgtcttgag
5881 tccaacccgg taagacacga cttatcgcca ctggcagcag ccaccggtaa caggattagc
5941 agagcgaggt atgtaggcgg tgccacagag ttcttgaagt ggtggcccaa ctacggctac
6001 cictagaagga cagtatttgg Catctgcgct ctgctgaagc cagttacctt cggaaaaaga
6061 gtcggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgttcgc
6121 aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcccttgat ctcttccacg
6181 uggtctgacg ctcagcggaa cgaaaactca cgctaaggga ttttggtcat gagattatca
6241 aaaaggatct tcacctagat ccttccaaac taaaaatgaa gccctaaatc aatctaaagt
6301 acatacgagt aaacttggcc tgacagttac caatgcttaa tcagtgaggc acctatctca
6361 cicgatctgtc tatttcgttc atccacagtc gcctgactcc ccgtcgtgta gataactacg
6421 .'.tacgggagg gcttaccatc tggccccagt gctgcaaCga taccgcgaga cccacgctca
6481 ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggc
6541 cctgcaactt catccgcctc cacccagtct attaattgtt gccgggaagc tagagtaagc
6601 agttcgccag ttaatagttt gcgcaacgtc gttgccattg ctacaggcaC cgtggtgtca
6661 cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca
6721 tgatccccca tgctgcgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga
6781 agtaagttgg ccgcagtgtt atcactcatg gttacggcag caccgcataa ttctcttact
6841 gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga
6901 gaatagtgta tgcggcgacc gagctgctct tgcccggcgt caatacggga taataccgcg
6961 ccacatagca gaactttaaa agtgctcatc attggaaaac gcccctcggg gcgaaaactc
7021 tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga
7081 tcktcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat
7141 gccgcaaaaa agggaaCaag ggcgacacgg aaatgttgaa tactcatact ctcccttctc
7201 caatattact gaagcattta tcagggttat cgtctcatga gcggatacat atttgaatgt
7261 atctagaaaa ataaacaaat aggggttccg cgcacacccc cccgaaaagt gccac






SEQ ID NO: 37 (chicken ovalbumin enhancer)
ccgggbtgca gaaaaatgcc
cttgajbctga tacctgattt
cagagagaaa ccatcactga
attca'tctgt gacctgagca
atgaaaaggc aatttccaca
tgctccttcc taatgtcaaa
gtaggcttta gtgattggat
ttttggataa aaagtgcttt
tggtttaggg acagacccac
ctgacctttt cttgggacaa
ttgcacagct gtgctgggca
gcaagaagat tgttgcttac

aggtggacta tgaactcaca 
tcttcaaact ggggaaacaa 
tggctacagc accaaggtat 
aaatgattta tctctccatg 
ctcacaatat gcaacaaaga 
attgtagtgg caaagaggag 
aagaggcttt gacctgtgag 
tataactttc aggtctccga 
aatgaaatgc ctggcatagg 
gcactgtcaa acaatgtgtg 
gggcaatcca ttgccaccta 
tctctctaga 



tccaaaggag 
cacaatccca 
gcaatggcaa 
aatggttgct 
caaacagaga 
aacaaaatct 
ctcacctgga 
gtctttattc 
aaagggcagc 
acaaaactat 
tcccaggtaa 



caaaacagct 
tccattcgac 
tctttccctc 
acaattaatg 
caagttctga 
cttcatatcc 
atgagactgt 
agagccttag 
ttgtactgct 
ccttccaact 



SEQ ID NO: 38 (5' untranslated region) 

GTGGATCAACATACAGCTAGAAAGCTGTATTGCCTTTAGCACTCAAGCTCAAAAGACAACTCAGAGTTC 
ACC 

SEQ ID NO:39 (putative cap site) 

ACATACAGCTAG AAAGCTGTAT TGCCTTTAGC ACTCAAGCTC AAAAGACAAC TCAGAGTTCA



SEQ ID NO: 40 (fragment of ovalbumin promoter - chicken)
GAGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG
AACAATAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC
ATCTGCCAGG CCATTAAGTT ATTCATGGAA GATCTTTGAG GAACACTGCA
AGTTCATATC ATAAACACAT TTGAAATTGA GTATTGTTTT GCATTGTATG
GAGCTATGTT TTGCTGTATC CTCAGAAAAA AAGTTTGTTA TAAAGCATTC
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAGG AAAGAAAGTG
CGTCTGCTCT TCACTCTAGT CTCAGTTGGC TCCTTCACAT GCATGCTTCT
TTATTTCTCC TATTTTGTCA AGAAAATAAT AGGTCACGTC TTGTTCTCAC
TTATGTCCTG CCTAGCATGG CTCAGATGCA CGTTGTAGAT ACAAGAAGGA
TCAAATGAAA CAGACTTCTG GTCTGTTACT ACAACCATAG TAATAAGCAC
ACTAACTAAT AATTGCTAAT TATGTTTTCC ATCTCTAAGG TTCCCACATT
TTTCTGTTTT CTTAAAGATC CCATTATCTG GTTGTAACTG AAGCTCAATG
GAACATGAGC AATATTTCCC AGTCTTCTCT CCCATCCAAC AGTCCTGATG
GATTA.GCAGA ACAGGCAGAA AACACATTGT TACCCAGAAT TAAAAACTAA
TATTTGCTCT CCATTCAATC CAAAATGGAC CTATTGAAAC TAAAATCTAA
CCCAATCCCA TTAAATGATT TCTATGGCGT CAAAGGTCAA ACTTCTGAAG
GGAACCTGTG GGTGGGTCAC AATTCAGGCT ATATATTCCC CAGGGCTCAG
C






SEQ ID NO:41 pTnMCS (CMV-CHOVg-ent-ProInsulin-synPA)
1 ccgacgcgcc ctgcagcggc gcattaagcg cggcgggtgt ggcggttacg cgcagcgtga
61 ccgccacact tgccagcgcc ctagcgcccg ctccctCcgc tttcttccct tcctttctcg
121 ccacgttcgc cggcatcaga ttggctattg gccattgcat acgttgtatc catatcataa
181 tatgtacact tatattggcc catgtccaac attaccgcca tgttgacatt gattattgac
241 tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata cggagttccg
301 cgttacataa cttacggtaa atggcccgcc cggctgaccg cccaacgacc cccgcccatt
361 gacgtcaaca atgacgtatg ttcccatagt aacgccaaca gggactcccc attgacgcca
421 atgggtggag catttacggt aaactgccca cttggcagta catcaagcgt atcatatgcc
481 aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatc acgcccagca
541 catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac
601 catggtgatg cggttttggc agtacatcaa tgggcgtgga cagcggtctg actcacgggg
661 atttccaagt ctccacccca ttgacgtcaa tgggagttcg tcttggcacc aaaatcaacg
721 gyactttcca aaatgtcgta acaacCccgc cccattgacg caaatgggcg gtaggcgtgt
781 a<:ggtgggag gcctatataa gcagagctcg tttagtgaac cgtcagatcg cctggagacg
841 Cijatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg
901 gCjaacggtgc attggaacgc ggattccccg tgccaagagt gacgtaagta ccgcctatag
961 ai:tctatagg cacacccctt tggctcttat gcatgctata ctgtttttgg cttggggccc
1021 atacaccccc gcttccttat gctataggtg atggtatagc ttagcctata ggtgtgggtt
1081 attgaccatt attgaccact cccctattgg tgacgatact ttccattact aatccataac
1141 .'^tggctcttC gccacaacta tctctattgg ctatatgcca atactctgtc cttcagagac
1201 I'lgacacggac tctgtatctt t'acaggatgg ggtcccattt attatttaca aattcacata
1261 tacaacaacg ccgtcccccg tgcccgcagt ttttattaaa catagcgCgg gatccccacg
1321 cgaatctcgg gtacgtgttc cggacatggg ctcCCctccg gtagcggcgg agcttccaca
1381 tccgagccct ggtcccatgc ctccagcggc tcatggtcgc tcggcagctc cttgctccta
1441 acagtggagg ccagacttag gcacagcaca atgcccacca ccaccagtgt gccgcacaag
1501 gccgtggcgg tagggtatgt gtctgaaaat gagcgtggag attgggctcg cacggctgac
1561 gcagatggaa gactcaaggc agcggcagaa gaagatgcag gcagctgagt tgttgtattc
1621 tgataagagt cagaggtaac tcccgttgcg gtgctgttaa cggtggaggg cagtgtagtc
1681 tgagcagtac tcgttgctgc cgcgcgcgcc accagacata atagctgaca gactaacaga
1741 ctgttccttt ccatgggtct tttctgcagt caccgtcgga ccatgcgcga actcgatatt
1801 ttacacgact ctctttacca attctgcccc gaattacact taaaacgact caacagctta
1861 acgttggctt gccacgcatt acttgactgt aaaactctca ctcttaccga acttggccgt
1921 aacctgccaa ccaaagcgag aacaaaacat aacatcaaac gaatcgaccg attgttaggt
1981 aatcgtcacc tccacaaaga gcgactcgct gtataccgtt ggcacgccag ctttatctgt
2041 tcgggcaata cgatgcccat tgtacttgtt gactggtctg atattcgtga gcaaaaacga
2101 cjttatggcat tgcgagcttc agtcgcacta cacggtcgtt ctgttactct ttatgagaaa
2161 c'cgttcccgc tttcagagca atgttcaaag aaagctcatg accaatttct agccgacctt
2221 <rcgagcaCtc taccgagtaa caccacaccg ctcattgtca gtgatgctgg ctttaaagcg
2281 ccatggtata aatccgttga gaagctgggt cggcaccggt taagtcgagt aagaggaaaa
2341 gtacaatatg cagacctagg agcggaaaac tggaaaccta tcagcaactt acatgatatg
2401 r.catctagcc actcaaagac tttaggctat aagaggctga ctaaaagcaa tccaatccca
2461 t.gccaaattc tattgtataa atctcgctct aaaggccgaa aaaaccagcg ctcgacacgg
2521 t.ctcattgtc accacccgtc acctaaaatc cactcagcgt cggcaaagga gccatgggtt
2581 ctagcaacta acttacctgc tgaaattcga acacccaaac aacttgttaa Catctattcg
2641 aagcgaatgc agattgaaga aaccttccga gacttgaaaa gtcctgccta cggaccaggc
2701 rtacgccata gccgaacgag cagctcagag cgttttgata tcatgctgct aatcgccctg
2761 c^tgcttcaac taacatgctg gcttgcgggc gttcatgctc agaaacaagg ttgggacaag
2821 cacttccagg ctaacacagt cagaaatcga aacgtactct caacagtccg cttaggcatg
2881 gaagtcttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc 
2941 ctactagctc aaaatttatt cacacatggt tacgctttgg ggaaattatg aggggatcgc 
3001 tctagagcga tccgggatct cgggaaaagc gttggtgacc aaaggtgcct tttatcatca 
3061 ctttaaaaat aaaaaacaat tactcagtgc ctgttataag cagcaattaa ttatgattga
3121 tgcctacatc acaacaaaaa ctgiatttaac aaatggttgg tctgccttag aaagcataCt
3181 tgaacattat cttgattata ttattgataa taacaaaaac cttatcccta t'ccaagaagt
3241 gatgcctatc attggttgga atgaacttga aaaaaattag cctcgaatac attactggta
3301 aggtaaacgc cattgtcagc aaattgatcc aagagaacca acttaaagct ttcctgacgg

3361 aatgttaatt ctcgttgacc ctgagcactg atgaatcccc taacgatttt ggtaaaaatc
3421 attaagttaa ggcggataca catcttgtca tacgatcccg gtaatgtgag ttagcCcact
3481 cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg tggaattgtg
3541 agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa gcgcgcaatt
3601 aaccctcact aaagggaaca aaagctggag ctccaccgcg gtggcggccg ctctagaact
3661 agtggatccc ccgggcatca gattggctat tggccactgc atacgttgta tccatatcat
3721 aatatgtaca tttatattgg ctcatgtcca acattaccgc catgttgaca ttgattattg
3781 actagctact aatagtaatc aattacgggg tcattagttc atagcccata tatggagttc
3841 cgcgttacat aacttacggt aaatggcccg cctggctgac cgcccaacga cccccgccca
3901 ttgacgtcaa taatgacgta tgttcccata gtaacgccaa tagggacttt ccattgacgt
3961 caatgggtgg agtatttacg gtaaactgcc cacttggcag tacatcaagt gtatcatatg

4021 ccaagtacgc cccctaCCga cgtcaatgac ggtaaatggc ccgcctggca ttacgcccag 
4081 tacatgacct tatgggactt tcctacttgg cagtacatcc acgtatcagt catcgccact 
4141 accatggtga cgcggttttg gcagcacatc aatgggcgtg gatagcggtt tgactcacgg 
4201 ggatttccaa gtctccaccc cattgacgcc aatgggagtt tgttttggca ccaaaatcaa 
4261 cgggaccttc caaaatgtcg taacaactcc gccccattga cgcaaatggg cggtaggcgt
4321 gtacggtggg aggtctatat aagcagagct cgtttagtga accgtcagat cgcctggaga
4381 cgccatccac gctgttttga cctccataga agacaccggg accgatccag cctccgcggc 
4441 cgggaacggt gcatcggaac gcggattccc cgtgccaaga gtgacgtaag taccgcctat 
4501 agactctata ggcacacccc tttggctctt atgcatgcta tactgttttt ggcttggggc 

4561 ctatacaccc ccgcttcctt atgctatagg tgatggtata gcttagccta taggtgtggg
4621 ttattgacca ttattgacca ctcccctatt ggtgacgata ctttccatta ctaatccata
4681 acatggctct ttgccacaac tatctctatt ggctatatgc caatactctg tccttcagag
4741 actgacacgg acCctgtatt tttacaggat ggggtcccat ttattattta caaattcaca 
4801 tatacaacaa cgccgtcccc cgtgcccgca gttcctatca aacatagcgt gggatctcca 

4861 cgcgaatctc gggtacgtgt tccggacatg ggctcttctc cggtagcggc ggagcttcca
4921 catccgagcc ctggtcccat gcctccagcg gctcatggtc gctcggcagc tccttgctcc 
4981 taacagtgga ggccagactt aggcacagca caatgcccac caccaccagt gtgccgcaca 
5041 aggccgtggc ggtagggtat gtgtctgaaa atgagcgtgg agattgggct cgcacggctg 
5101 acgcagatgg aagacctaag gcagcggcag aagaagatgc aggcagctga gttgttgtat 

5161 tctgacaaga gtcagaggta actcccgttg cggtgctgtt aacggtggag ggcagtgtag
5221 tctgagcagt acccgtCgct gccgcgcgcg ccaccagaca taatagctga cagactaaca 
5281 gactgttcct ttccatgggt cttttctgca gtcaccgtcg ggatccatgg gctccatcgg 
5341 cgcagcaagc atggaatttt gttttgatgt attcaaggag ctcaaagtcc accatgccaa 
5401 cgagaacatc ttctactgcc ccattgccat catgtcagcc ctagccatgg tatacctggg 

5461 tgcaaaagac agcaccagga cacagataaa taaggttgtt cgctttgata aacttccagg
5521 attcggagac agtatcgaag ctcagtgtgg cacatctgta aacgttcact cttcacttag 
5581 agacatcctc aaccaaatca ccaaaccaaa tgatgtttat tcgttcagcc ttgccagtag 
5641 actttatgct gaagagagat acccaatcct gccagaatac ttgcagtgtg tgaaggaact 
5701 gtatagagga ggcCtggaac ctatcaactt tcaaacagct gcagatcaag ccagagagct 

5761 catcaattcc tgggtagaaa gtcagacaaa tggaattatc agaaatgtcc ttcagccaag
5821 ctccgcggat tctcaaactg caatggttct ggttaatgcc accgtcttca aaggactgtg
5881 ggagaaaaca tttaaggatg aagacacaca agcaatgcct ttcagagtga ctgagcaaga
5941 aagcaaacct gtgcagatga tgtaccagat tggtttattt agagtggcat caatggcttc 
6001 tgagaaaatg aagatcctgg agcttccatt tgccagtggg acaatgagca tgttggtgct 

6061 gttgcctgat gaagtctcag gccttgagca gcttgagagt ataatcaact ttgaaaaact
6121 gactgaatgg accagttcta atgttatgga agagaggaag atcaaagtgt actcacctcg 
6181 catgaagatg gaggaaaaat acaacctcac atctgtctta atggctatgg gcattactga 
6241 cgtgtttagc tcttcagcca atctgtctgg catctcctca gcagagagcc tgaagatatc 
6301 tcaagctgtc catgcagcac atgcagaaat caatgaagca ggcagagagg tggcagggtc 

6361 agcagaggcc ggagtggatg ctgcaagcgt ctctgaagaa tttagggctg accatccatt
6421 cctcttctgt atcaagcaca tcgcaaccaa cgccgttctc ttctttggca gatgtgtttc 
6481 ccgcggccag cagatgacgc accagcagat gacgcaccag cagatgacgc accagcagat 
6541 gacgcaccag cagatgacgc accagcagat gacgcaacaa catgtatcct gaaaggctct 
6601 tgtggctgga tcggcctgct ggatgacgat gacaaatttg tgaaccaaca cctgtgcggc 
6661 tcacacctgg tggaagcCct ctacctagtg tgcggggaac gaggcttctt ctacacaccc 
6721 aagacccgcc gggaggcaga ggacctgcag gtggggcagg tggagctggg cgggggccct 
6781 ggtgcaggca gcctgcagcc cttggccctg gaggggcccc tgcagaagcg tggcattgtg 
6841 gaacaatgct gtaccagcat ctgctccctc taccagctgg agaactactg caactagggc
6901 gcctaaaggg cgaattatcg cggccgccct agaccaggcg cctggatcca gatcacctct
6961 ggctaacaaa agatcagagc tctagagatc tgtgcgctgg ttctttgtgg atctgctgtg 
7021 ccttctagtt gccagccatc Cgttgtttgc ccctcccccg tgccttcctt gaccctggaa 
7081 ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt 
7141 aggtgtcatt ctattctggg gggtggggtg gggcagcaca gcaaggggga ^gactgggaa
7201 gacaatagca ggcatgctgg ggatgcggtg ggctctatgg gtacctctct ctctctctct 
7261 ctctctctct ctctctctct ctctcggtac ctctctcgag ggggggcccg gtacccaatt 
7321 cgccctatag tgagtcgtat tacgcgcgct cactggccgt cgttttacaa cgtcgtgact 
7381 gggaaaaccc tggcgttacc caacttaatc gccttgcagc acatccccct ttcgccagct 

7441 ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc agcctgaatg
7501 gcgaatggaa attgtaagcg ttaatatttt gttaaaattc gcgttaaatt tttgttaaat
7561 cagctcattt tttaaccaat aggccgaaat cggcaaaatc ccttataaat caaaagaata 
7621 gaccgagata gggttgagtg ttgttccagt ttggaacaag agtccactat taaagaacgt 
7681 ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc gatggcccac tactccggga 

7741 tcatatgaca agatgtgtat ccaccttaac ttaatgattt ttaccaaaat cattagggga
7801 ttcatcagtg ctcagggtca acgagaatta acattccgtc aggaaagctc atgatgatga 
7861 tgtgctCaaa aacttactca atggctggtt atgcatatcg caatacatgc gaaaaaccta 
7921 aaagagcttg ccgataaaaa aggccaattt attgctattt accgcggctt tttattgagc 
7981 tCgaaagata aataaaatag ataggtttta tctgaagcta aatcttcttt atcgtaaaaa 

8041 atgccctctc gggtcatcaa gagggtcatC atatttcgcg gaataacatc aCttggtgac
8101 gaaataacta agcacttgtc tcctgtttac tcccctgagc ctgaggggtt aacatgaagg 
8161 tcatcgatag caggataata atacagtaaa acgctaaacc aataatccaa atccagccat 
8221 cccaaattgg tagtgaatga ttataaataa cagcaaacag taatgggcca ataacaccgg 
8281 ttgcattggt aaggctcacc aataatccct gtaaagcacc ttgctgatga ctctttgttt 

8341 ggatagacat cactccctgt aatgcaggta aagcgatccc accaccagcc aataaaatta
8401 aaacagggaa aactaaccaa ccttcagata taaacgctaa aaaggcaaat gcactactat
8461 ctgcaataaa tccgagcagt actgccgttt tttcgcccat ttagtggcta ttcttcctgc
8521 cacaaaggct tggaatactg agtgtaaaag accaagaccc gtaatgaaaa gccaaccatc
8581 atgctattca tcatcacgat ttctgtaata gcaccacacc gtgctggatt ggctatcaat
8641 gcgctgaaat aataatcaac aaatggcatc gttaaataag tgatgtatac cgatcagctt
8701 ttgttccctt tagtgagggt taattgcgcg cttggcgtaa tcatggtcat agctgtttcc 
8761 tgtgtgaaat tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg 
8821 taaagcctgg ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc 
8881 cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg 

8941 gagaggcggt ttgcgtattg ggcgctctcc cgcttcctcg ctcactgact cgctgcgctc
9001 ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 
9061 agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 
9121 ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 
9181 caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 

9241 gttcccGcct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata
9301 cctg'tccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta 
9361 tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgacic cccccgttca 
9421 gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 
9481 cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 

9541 tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg
9601 tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 
9661 caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 
9721 aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 
9781 cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 

9841 ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc
9901 tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 
9961 atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 
10021 tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 
10081 aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 

10141 catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt
10201 gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 
10261 ttcattcagc tccggttccc aacgatcaag gcgagtcaca tgatccccca tgttgtgcaa 
10321 aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 
10381 atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 

10441 cttttctgcg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc
10501 gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 
10561 agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 
10621 gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 
10681 caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 

10741 ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta
10801 tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 
10861 aggggttccg cgcacatttc cccgaaaagt gccac

SEQ ID NO:42 (pTnMOD (CMV-CHOVg-ent-ProInsulin-synPA))

1 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggcgt ggtggttacg cgcagcgtga 
61 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 
121 ccacgttcgc cggcatcaga ttggctattg gccattgcat acgttgtatc catatcataa 
181 tatgtacatt tacattggct catgtccaac attaccgcca tgttgacatt gattactgac 
241 Jtagttattaa tagtaatcaa tcacggggtc attagttcat agcccatata tggagttccg 
301 cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 
361 gacgtcaata atgacgtatg tccccatagt aacgccaata gggaccttcc attgacgtca 
421 atgggtggag tacttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 
481 aagtacgccc cctattgacg tcaatgacgg Caaacggccc gccCggcatt atgcccagca 
541 catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 
601 catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg
661 atttccaagt ccccacccca ttgacgtcaa tgggagtttg tttcggcacc aaaatcaacg
721 ggactttcca aaatgccgta acaactccgc cccactgacg caaacgggcg gtaggcgtgc
781 acggtgggag gtctatataa gcagagctcg tctagcgaac cgtcagatcg cctggagacg 
841 ccatccacgc tgctttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg 
901 ggaacggcgc atcggaacgc ggattccccg tgccaagagt gacgtaagta ccgcctatag
961 actctatagg cacacccctt tggctcttat gcatgctata ccgctctcgg cttggggccc
1021 atacaccccc gcttccttat gctataggtg atggtatagc ttagcctata ggtgtgggtt
1081 actgaccatt attgaccact cccctattgg tgacgatact ttccattact aatccataac 
1141 atggctcttc gccacaacta tctctattgg ctatatgcca atactctgtc cctcagagac 
1201 tgacacggac tctgtatttt tacaggatgg ggtcccattt attacttaca aattcacata 
1261 tacaacaacg ccgtcccccg tgcccgcagt ctttattaaa catagcgtgg gatctccacg 
1321 cgaatctcgg gtacgtgttc cggacatggg ctcttctccg gtagcggcgg agcttccaca 
1381 tccgagccct ggtcccatgc ccccagcggc tcatggtcgc tcggcagctc cCtgctccta 
1441 acagcggagg ccagacttag gcacagcaca atgcccacca ccaccagtgt gccgcacaag 
1501 gccgcggcgg tagggcatgt gtccgaaaat gagcgtggag attgggctcg cacggctgac 
1561 gcagatggaa gacttaaggc agcggcagaa gaagatgcag gcagctgagt tgttgtatCC 
1621 tgataagagt cagaggtaac tcccgctgcg gtgccgttaa cggtggaggg cagtgtagtc 
1681 tgagcagtac ccgttgccgc cgcgcgcgcc accagacata atagctgaca gactaacaga 
1741 ctgCtccttt ccatgggtct tttctgcagt caccgtcgga ccatgtgtga actcgatatt 
1801 ttacatgatt ctctttacca actctgcccc gaattacact taaaacgact caacagctta 
1861 acgttggctt gccacgcatt acttgactgt aaaactctca ctcttaccga acttggccgt
1921 aacctgccaa ccaaagcgag aacaaaacat aacatcaaac gaatcgaccg attgttaggt 
1981 aatcgtcacc tccacaaaga gcgactcgct gtataccgtt ggcatgctag ctttatctgt
2041 tcgggcaata cgatgcccat tgtacttgtt gactggtctg atattcgtga gcaaaaacga
2101 cttatggtac tgcgagcttc agtcgcacta cacggtcgtt ctgttactct ttatgagaaa
2161 gcgttcccgc tttcagagca atgttcaaag aaagctcatg accaatttct agccgacctt 
2221 gcgagcattc taccgagtaa caccacaccg ctcattgtca gcgatgctgg ctttaaagtg 
2281 ccatggtata aatccgttga gaagctgggt tggtactggt taagtcgagt aagaggaaaa
2341 gtacaatatg cagacccagg agcggaaaac cggaaaccta tcagcaactt acatgatatg
2401 tcatctagtc actcaaagac tttaggcCat aagaggctga cCaaaagcaa tccaatctca
2461 tgccaaaCtc tattgtataa atctcgctct aaaggccgaa aaaatcagcg ctcgacacgg 
2521 actcattgtc accacccgtc acctaaaatc tactcagcgt cggcaaagga gccatgggtt 
2581 ctagcaacta acttaccCgt tgaaattcga acacccaaac aacctgttaa tatctatccg 
2641 aagcgaacgc agattgaaga aaccttccga gacCtgaaaa gtcctgccta cggactaggc 
2701 ctacgccata gccgaacgag cagctcagag cgttttgata tcatgctgct aatcgccctg 
2761 aCgcttcaac taacatgttg gcttgcgggc gttcatgctc agaaacaagg ttgggacaag 
2821 cacttccagg ctaacacagt cagaaatcga aacgtactct caacagttcg cttaggcatg 
2881 gaagctttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc 
2941 ctactagctc aaaatttatt cacacatggt tacgctttgg ggaaattatg ataacgaccc 
3001 agatcacttc tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg 
3061 gatctgctgt gccttctagt tgccagccat ctgttgtttg cccctccccc gtgccttcct 
3121 tgaccccgga aggtgccact cccactgtcc tttcctaata aaatgaggaa attgcatcgc 
3181 attgtctgag taggtgtcat tctattctgg ggggtggggt ggggcagcac agcaaggggg 
3241 aggattggga agacaatagc aggcatgctg gggatgcggt gggctctatg ggtacctctc 
3301 tctctctctc tctctctctc tctctctctc tctctcggta cctctctctc tctctctctc
3361 tctctctctc tctctctctc tcggtaccag gtgctgaaga attgacccgg tgaccaaagg
3421 tgccttttat catcacttta aaaataaaaa acaattactc agtgcctgtt ataagcagca 
3481 attaattatg attgatgcct acatcacaac aaaaactgat ttaacaaatg gttggtctgc 
3541 cttagaaagt atatttgaac attatcttga ttatattatt gataataata aaaaccttat 
3601 ccctatccaa gaagtgatgc ctatcattgg ttggaatgaa cttgaaaaaa attagccttg
3661 aatacattac tggtaaggta aacgccattg tcagcaaatt gatccaagag aaccaactta
3721 aagctttcct gacggaatgt taattctcgt tgaccctgag cactgatgaa tcccctaatg
3781 attttggtaa aaatcattaa gttaaggtgg atacacatct tgtcatatga tcccggtaat
3841 gtgagttagc ccactcatta ggcaccccag gcctcacacC ttatgcttcc ggctcgtatg
3901 tt^tgtggaa ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac
3961 gccaagcgcg caattaaccc tcactaaagg gaacaaaagc tggagctcca ccgcggtggc
4021 ggccgctcta gaaccagtgg atcccccggg catcagattg gctattggcc attgcatacg
4081 ttgtatccat atcataatat gtacattcat attggctcac gtccaacatt accgccatgt
4141 tgacattgat tattgactag ttattaatag taatcaatca cggggccatt agttcatagc
4201 ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg ctgaccgccc
4261 aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac gccaacaggg
4321 actttccatt gacgtcaatg ggcggagcat ttacggtaaa ctgcccactt ggcagtacat
4381 caagcgtatc atacgccaag tacgcdccct attgacgcca acgacggtaa atggcccgcc
4441 tggcactatg cccagtacat gaccttatgg gactttccta cttggcagta catctacgta
4501 ttagtcatcg ctattaccat ggcgatgcgg ttttggcagt acatcaatgg gcgtggatag
4561 cggtttgact cacggggatt tccaagtctc caccccactg acgtcaatgg gagtttgCtt
4621 tggcaccaaa atcaacggga ctctccaaaa tgtcgtaaca actccgcccc attgacgcaa
4681 atgggcggta ggcgtgtacg gcgggaggtc tatataagca gagctcgttt agtgaaccgt
4741 cagatcgcct ggagacgcca tccacgccgt tttgacctcc atagaagaca ccgggaccga
4801 tccagcctcc gcggccggga acggtgcatt ggaacgcgga ttccccgtgc caagagtgac
4861 gtaagtaccg cctatagact ctacaggcac acccctttgg cccttatgca tgctatactg
4921 tttttggctt ggggcccata cacccccgct tccttatgct ataggtgacg gtatagctta
4981 gcctataggt gtgggttatt gaccattatt gaccactccc ctattggtga cgacactttc
5041 cattactaaC ccataacatg gctctttgcc acaactatcc ctattggcca tatgccaaca
5101 ctctgtccct cagagaccga cacggaccct gtattttcac aggatggggt cccatctact
5161 acccacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt cattaaacat
5221 agcgtgggat ctccacgcga atctcgggta cgcgctccgg acatgggctc ctctccggta
5281 gcggcggagc ctccacaccc gagccctggt cccatgcctc cagcggctca tggtcgctcg
5341 gcagctcctc gctcctaaca gtggaggcca gacttaggca cagcacaatg cccaccacca
5401 ccagtgcgcc gcacaaggcc gtggcggtag ggtatgtgtc tgaaaatgag cgtggagatt
5461 gggctcgcac ggctgacgca gatggaagac ttaaggcagc ggcagaagaa gatgcaggca
5521 gctgagttgt tgtattctga taagagtcag aggtaactcc cgttgcggtg ctgttaacgg
5581 tggagggcag tgtagtctga gcagtacccg ctgctgccgc gcgcgccacc agacataata
5641 gctgacagac taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgggatc
5701 catgggctcc atcggcgcag caagcatgga attttgtttt gatgtattca aggagctcaa
5761 agtccaccat gccaacgaga acatcttcta ctgccccatt gccatcatgt cagctctagc
5821 caCggtatac ctgggtgcaa aagacagcac caggacacag ataaataagg ttgttcgctt
5881 cgataaactt ccaggattcg gagacagtat tgaagctcag cgtggcacac ctgtaaacgt
5941 tcactcttca cttagagaca tcctcaacca aatcaccaaa ccaaaCgatg tttattcgtt
6001 cagccttgcc agtagacttt atgctgaaga gagataccca atcctgccag aatacttgca
6061 gtgtgtgaag gaactgtata gaggaggcct ggaacctatc aactttcaaa cagctgcaga
6121 ccaagccaga gagctcatca attcctgggt agaaagtcag acaaatggaa ttatcagaaa
6181 tgtccttcag ccaagctccg tggattctca aactgcaatg gttctggtta atgccattgt
6241 cttcaaagga ctgtgggaga aaacatttaa ggatgaagac acacaagcaa tgcctttcag
6301 agtgactgag caagaaagca aacccgtgca gatgatgCac cagattggtt tatttagagt
6361 ggcatcaatg gcttctgaga aaacgaagat cctggagctt ccatttgcca gtgggacaat
6421 gagcatgttg gtgctgttgc ctgatgaagt ctcaggcctt gagcagcttg agagtataat
6481 caacttcgaa aaactgactg aatggaccag ttctaatgtt atggaagaga ggaagatcaa
6541 agtgtactta cctcgcatga agatggagga aaaatacaac ctcacatctg tcttaatggc
6601 tatgggcatt actgacgtgt ctagctcttc agccaatctg tctggcatct cctcagcaga
6661 gagcctgaag atatctcaag ctgtccatgc agcacatgca gaaatcaatg aagcaggcag
6721 agaggtggta gggtcagcag aggctggagt ggatgctgca agcgtctctg aagaatttag
6781 ggctgaccat ccatccccct tctgtatcaa gcacatcgca accaacgccg ttccctcctt
6841 cggcagatgt gtttcccgcg gccagcagac gacgcaccag cagatgacgc accagcagat
6901 gacgcaccag cagatgacgc accagcagat gacgcaccag cagatgacgc aacaacatgt
6961 atcctgaaag gctcttgtgg ctggatcggc ctgctggatg acgatgacaa atttgtgaac
7021 caacacctgt gcggctcaca cctggtggaa gctctctacc tagtgtgcgg ggaacgaggc
7081 Ctcttctaca cacccaagac ccgccgggag gcagaggacc tgcaggtggg gcaggtggag
7141 ctgggcgggg gccctggtgc aggcagcctg cagcccttgg ccctggaggg gtccctgcag
7201 aagcgtggca ttgtggaaca atgccgtacc agcatctgct ccctctacca gctggagaac
7261 tactgcaact agggcgccta aagggcgaat tatcgcggcc gctctagacc aggcgcctgg
7321 atccagatca ctcctggcta ataaaagatc agagctctag agatctgtgc gttggttttt
7381 cgtggatctg ctgtgcctcc tagttgccag ccatctgttg tttgcccctc ccccgtgcct
7441 tccttgaccc kggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca
7501 tcgcattgtc tgagtaggtg ccattccatt ctggggggcg gggtggggca gcacagcaag
7561 ggggaggatc gggaagacaa tagcaggcat gctggggatg cggtgggctc tatgggtacc
7621 tctccctctc tctctctctc tctctccctc tctctctctc ggtacctctc ctcgaggggg
7681 ggcccggtac ccaattcgcc ctatagtgag tcgtattacg cgcgctcact ggccgtcgtt
7741 tcacaacgtc gtgactggga aaaccccggc gttacccaac ttaatcgcct tgcagcacat
7801 ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag
7861 ttgcgcagcc tgaatggcga atggaaattg taagcgttaa tatttcgtta aaatccgcgt
7921 taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc aaaatccctt
7981 ataaatcaaa agaatagacc gagatagggt tgagtgttgt tccagtttgg aacaagagtc
8041 cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctaC cagggcgatg
8101 gcccactact ccgggatcat atgacaagat gtgtatccac cttaacttaa tgatttttac 
8161 caaaatcatt aggggattca tcagtgctca gggtcaacga gaattaacat tccgtcagga
8221 aagcttatga tgatgatgtg cttaaaaact tactcaatgg ctggttatgc atatcgcaac

8281 acatgcgaaa aacctaaaag agcttgccga caaaaaaggc caatctattg ctatttaccg
8341 cggccctcta ttgagctcga aagataaata aaatagatag gctttatttg aagctaaatc 
8401 ttctttatcg taaaaaatgc cctcttgggt tatcaagagg gtcattatat ttcgcggaat
8461 aacaccattt ggcgacgaaa taactaagca ctcgtctcct gtttactccc ctgagcttga
8521 ggggttaaca tgaaggtcat cgatagcagg ataataatac agtaaaacgc caaaccaata

8581 acccaaatcc agccatccca aattggtagt gaatgattat aaataacagc aaacagtaaC 
8641 gggccaataa caccggttgc attggtaagg ctcaccaata atccctgCaa agcaccttgc 
8701 tgatgactct tcgtttggat agacatcact ccctgtaatg caggtaaagc gatcccacca 
8761 ccagccaata aaattaaaac agggaaaact aaccaacctt cagatataaa cgctaaaaag 
8821 gcaaatgcac tactatctgc aataaatccg agcagtactg ccgttttttc gcccatttag

8881 tggccattct tcctgccaca aaggcttgga atactgagtg taaaagacca agacccgtaa 
8941 tgaaaagcca accatcatgc tactcatcat cacgacctct gtaatagcac cacaccgtgc 
9001 tggattggct atcaatgcgc tgaaataata atcaacaaat ggcatcgtta aataagtgat 
9061 gcacaccgat cagctttcgt tccctttagt gagggtcaac tgcgcgcttg gcgtaatcat 
9121 ggtcatagct gttccccgtg cgaaattgtt atccgctcac aattccacac aacatacgag

9181 ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc acattaattg
9241 cgttgcgccc actgcccgcc ctccagtcgg gaaacccgtc gcgccagccg cattaacgaa
9301 tcggccaacg cgcggggaga ggcggcttgc gtattgggcg ctctcccgct tcctcgctca
9361 ctgactcgct gcgctcggtc gtccggctgc ggcgagcggc accagctcac tcaaaggcgg
9421 taatacggct atccacagaa ccaggggata acgcaggaaa gaacatgtga gcaaaaggcc

9481 agcaaaaggc caggaaccgc aaaaaggccg cgttgctggc gtttttccat aggctccgcc 
9541 cccctgacga gcatcacaaa aaccgacgct caagtcagag gtggcgaaac ccgacaggac
9601 tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc
9661 tgccgcttac cggatacctg tccgcctttc ccccttcggg aagcgtggcg ctttctcata
9721 gctcacgccg taggtaCccc agctcggtgc aggtcgttcg ccccaagctg ggctgtgtgc

9781 acgaaccccc cgctcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 
9841 acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 
9901 cgaggtatgt aggcggtgcc acagagttct tgaagtggtg gcctaactac ggctacacta 
9961 gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagctg 
10021 gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc

10081 agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 
10141 ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa 
10201 ggatcttcac ctagatcctt ttaaattaaa aatgaagttt caaaccaatc taaagtatat 
10261 atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga 
10321 cctgtctatt tcgtccatcc atagttgcct gactccccgt cgtgtagata actacgatac

10381 gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca cgcccaccgg 
10441 ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga agtggtcctg 
10501 caactttatc cgcctccatc cagtctatta actgttgccg ggaagctaga gtaagtagtt 
10561 cgccagttaa tagtttgcgc aacgttgtcg ccattgctac aggcatcgtg gtgtcacgct 
10621 cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga gtCacatgat

10681 cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt gtcagaagta
10741 agttggccgc agtgttatca cccatggtta tggcagcact gcataattct cttactgtca 
10801 tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca ctctgagaat 
10861 agtgtatgcg gcgaccgagt tgcCcttgcc cggcgtcaat acgggataat accgcgccac
10921 atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga aaactctcaa

10981 ggaCcttacc gccgttgaga tccagttcga tgtaacccac tcgtgcaccc aactgatctt
11041 cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg 
11101 caaaaaaggg aacaagggcg acacggaaac gttgaatact catactcttc ctttttcaat 
11161 attaccgaag catttatcag ggtcattgcc tcacgagcgg atacatattt gaatgtattt
11221 agaaaaataa acaaataggg gttccgcgca cattCccccg aaaagcgcca c



SEQ ID NO:43 (pTnMOD (Chicken OVep+OVg'+ENT+proins+syn polyA))


1 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 
61 ccgccacacc tgccagcgcc ctagcgcccg ctcctcccgc tttcttccct tcctttctcg 
121 ccacgttcgc cggcatcaga ttggctattg gccattgcat acgttgtatc catatcataa 
181 tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt gattattgac 
241 tagttattaa tagtaatcaa ttacggggtc attagtccat agcccacata tggagctccg

301 cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt
361 gacgccaata atgacgtatg tccccacagc aacgccaaCa gggactttcc attgacgtca
421 atgggtggag tatttacggt aaactgcctra cttggcagta catcaagtgt atcatatgcc
481 aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatc acgcccagta
541 catgacctta tgggactttc ctacttggca gCacatctac gtattagtca tcgctactac
601 catggtgatg cggtcttggc agtacatcaa tgggcgtgga tagcggtctg actcacgggg
661 atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg
721 ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt
781 acggcgggag gtctatataa gcagagctcg tttagtgaac cgtcagaccg cctggagacg
841 ccatccacgc tgctttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg
901 ggaacggtgc aCCggadcgc ggattccccg tgccaagagt gacgtaagta ccgcctatag
961 actctatagg cacacccctt cggctcccat gcacgctaca ctgttcttgg cttggggcct
1021 atacaccccc gcttccttat gctataggtg atggtatagc ttagcctata ggtgtgggtt
1081 attgaccatc attgaccacc cccctattgg tgacgacact ttccattact aatccataac
1141 atggctcttt gccacaacCa Cctctattgg ctatatgcca atactctgtc cttcagagac
1201 tgacacggac tctgtatttt tacaggacgg ggtCCCaCtt attatttaca aattcacata
1261 tacaacaacg ccgtcccccg tgcccgcagt ttttattaaa catagcgtgg gatctccacg
1321 cgaatctcgg gtacgtgttc cggacatggg cccctctccg gtagcggcgg agcttccaca
1381 tccgagccct ggtcccatgc cCccagcggc tcatggtcgc tcggcagctc cttgctccta
1441 acagtggagg ccagacctag gcacagcaca atgcccacca ccaccagtgt gccgcacaag
1501 gccgtggcgg tagggtatgt gtctgaaaat gagcgtggag accgggctcg cacggctgac
1561 gcagatggaa gactcaaggc agcggcagaa gaagatgcag gcagctgagt tgttgtattc
1621 tgataagagt cagaggtaac tcccgttgcg gtgctgttaa cggtggaggg cagtgtagtc
1681 cgagcagtac tcgttgctgc cgcgcgcgcc accagacata acagctgaca gaccaacaga
1741 ctgttccctt ccatgggcct cttctgcagt caccgtcgga ccatgtgcga actcgatatt
1801 ttacacgact ctctctacca attctgcccc gaattacacC Caaaacgact caacagctta
1861 acgttggctt gccacgcatt acttgactgt aaaactccca ctcttaccga acttggccgt
1921 aacccgccaa ccaaagcgag aacaaaacat aacatcaaac gaatcgaccg actgtcaggt
1981 aaccgtcacc tccacaaaga gcgactcgct gtataccgtt ggcatgctag ctttatctgt
2041 tcgggcaata cgatgcccat tgtacttgtt gactggtctg atattcgtga gcaaaaacga
2101 cttatggtac tgcgagcttc agtcgcacta cacggtcgtt ctgttactct ttatgagaaa
2161 gcgttcccgc tcccagagca atgttcaaag aaagctcacg accaacttct agccgacctt
2221 gcgagcattc taccgagtaa caccacaccg ctcattgtca gtgatgctgg ctttaaagtg
2281 ccatggtata aatccgttga gaagctgggt cggtactggt taagtcgagt aagaggaaaa
2341 gcacaatatg cagacctagg agcggaaaac tggaaaccta tcagcaactt acatgatatg
2401 tcatctagtc actcaaagac tttaggctat aagaggctga ctaaaagcaa tccaatctca
2461 tgccaaattc tattgtataa aCctcgcccc aaaggccgaa aaaatcagcg ctcgacacgg
2521 actcattgtc accacccgtc acckaaaatc tactcagcgt cggcaaagga gccatgggtt
2581 ctagcaacta acttacctgt Cgaaattcga acacccaaac aacttgttaa tatctattcg
2641 aagcgaatgc agatcgaaga aaccttccga gacttgaaaa gtcctgccta cggactaggc
2701 ctacgccata gccgaacgag cagctcagag cgtttcgata tcatgctgct aatcgccctg
2761 atgcttcaac caacatgttg gcttgcgggc gttcatgctc agaaacaagg ttgggacaag
2821 cactCccagg ctaacacagc cagaaatcga aacgtactct caacagttcg cttaggcatg
2881 gaagttttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc
2941 ctactagctc aaaatttatt cacacatggt tacgctttgg ggaaattatg aggggatcgc
3001 tctagagcga tccgggatct cgggaaaagc gttggtgacc aaaggtgcct tttatcatca
3061 ctttaaaaat aaaaaacaat tactcagtgc ctgttataag cagcaattaa ttatgattga
3121 tgcctacatc acaacaaaaa ctgatttaac aaatggctgg tctgccttag aaagtatatt
3181 tgaacattat cttgattata ttattgataa taataaaaac cttatcccta tccaagaagt
3241 gatgcctatc attggttgga atgaacttga aaaaaattag ccttgaatac attactggta
3301 aggtaaacgc cattgtcagc aaattgatcc aagagaacca acctaaagct ttcctgacgg
3361 aatgttaatt ctcgttgacc ctgagcactg atgaatcccc taatgatttt ggtaaaaatc
3421 attaagttaa ggtggataca catcttgtca tatgatcccg gtaatgtgag ttagctcact
3481 cattaggcac cccaggcttt acactttacg cttccggctc gtatgttgtg tggaattgtg
3541 agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa gcgcgcaatt
3601 aacccCcact aaagggaaca aaagctggag ctccaccgcg gtggcggccg ctctagaact
3661 agtggatccc ccgggctgca gaaaaatgcc aggtggacta tgaactcaca tccaaaggag
3721 cttgacctga tacccgattt ccttcaaact ggggaaacaa cacaatccca caaaacagct
3781 cagagagaaa ccatcactga tggctacagc accaaggcat gcaatggcaa tccattcgac
3841 attcatctgt gacctgagca aaatgattta tctctccatg aatggttgct tctttccctc
3901 atgaaaaggc aatttccaca ctcacaatat gcaacaaaga caaacagaga acaattaatg
3961 tgctccttcc taatgtcaaa attgtagtgg caaagaggag aacaaaatct caagtcccga
4021 gtaggtttCa gtgaccggaC aagaggcttt gacctgtgag ctcacctgga cttcatatcc
4081 ttttggataa aaagtgcttt tataactttc aggtctccga gccctcattc acgagactgc
4141 tggtttaggg acagacccac aatgaaacgc ctggcatagg aaagggcagc agagccttag
4201 ctgacctttt cttgggacaa gcattgtcaa acaatgtgtg acaaaactat ttgtactgct
4261 ttgcacagct gtgctgggca gggcaaccca ttgccaccta tcccaggtaa ccttccaact
4321 gcaagaagat tgttgcttac tccctctaga aagctcctgc agactgacat gcatttcata
4381 ggtagagata acatttactg ggaagcacat ctatcatcat aaaaagcagg caagattttc
4441 agaccctcct agtggctgaa aCagaagcaa aagacgtgat taaaaacaaa atgaaacaaa

acttagagac
ggcgctgttg
aaaactgact 
acctcgcatg 
cacttctggc
tgctgtgcct 
cctggaaggt 
tctgagtagg 
ttgggaagac 
tctctctctc 
cccaattcgc 
cgtgactggg 
gccagctggc 
ctgaatggcg 
gttaaatcag 
aagaatagac 
agaacgtgga 
tccgggatca 
taggggattc 
atgatgacgt 
aaacctaaaa 
attgagcttg 
gtaaaaaatg 
tggtgacgaa 



ggtgtagaca 
cttggcaagg 
acaaaagaaa 
acagagattg 
atttttaaat 
ctattatttc 
atgattgtcc 
tgccaggcca 
aacacatctg 
agaaaaaaag 
ccataggaaa 
tgcttcttta 
tgtcctgcct 
acttctggtc 
gttttccatc 
gtaactgaag 
cctgatggat 
tcgctctcca 
aatgatttct 
tcaggctata 
tgcccccagc 
gcaagcacgg 
aacatcttct 
aaagacagca 
ggagacagta 
atcctcaacc 
tatgctgaag 
agaggaggct 
aattcctggg 
gtggattctc 
aaaacactta 
aaacctgtgc 
aaaatgaaga 
cctgatgaag 
gaatggacca 
aagatggagg 
tttagctctt 
gctgtccatg 
gaggctggag 
ttctgtatca 
cgggcagcag
gcaccagcag 
ggctggatcg 
cacctggtgg 
acccgccggg 
gcaggcagcc 
caatgctgta 
taaagggcga 
taataaaaga 
tctagttgcc 
gccactccca 
tgtcattcta 
aacagcaggc 
tctctctctc 
cctatagtga 
aaaaccctgg 
gtaatagcga 
aatggaaatt 
ctcatccttc 
cgagataggg 
ctccaacgtc 
tatgacaaga 
atcagtgctc 
gcctaaaaac 
gagcttgccg 
aaagataaat 
ccctcttggg 
ataactaagc 




tccagcaaaa 
agaatgtaga 
atggcactga 
cagtgatctc 
caaacagtgc 
aatacagaac 
ctcgaaccat 
ttaagttatt 
aaattgagta 
tttgttataa 
gaaagtgcgt 
tttctcctat 
agcatggctc 
tgttactaca 
tctaaggttc 
ctcaatggaa 
tagcagaaca 
ttcaatccaa 
atggcgtcaa 
tattccccag 
actcaagccc 
aacttcgttt 
actgccccat 
ccaggacaca 
ttgaagctca 
aaatcaccaa 
agagataccc 
tggaacctat 
tagaaagtca 
aaactgcaat 
aggatgaaga 
agatgatgta 
tcctggagct 
tctcaggcct 
gttctaatgt 
aaaaatacaa 
cagccaatct 
cagcacatgc 
tggatgctgc 
agcacatcgc 
atgacgcacc 
atgacgcacc 
gcctgctgga 
aagctcccca 
aggcagagga 
tgcagccctt 
ccagcaCctg 
attatcgcgg 
tcagagctct 
agccatctgt 
ctgtcctttc 
ttctgggggg 
atgctgggga 
tctctctctc 
gtcgtattac 
cgttacccaa 
agaggcccgc 
gtaagcgtca 
aaccaatagg 
tCgagtgttg 
aaagggcgaa 
tgtgtatcca 
agggccaacg 
ttactcaatg 
ataaaaaagg 
aaaatagata 
ttatcaagag 
acttgtctcc 



aaatattatt 
tttctacagt 
ctaaacttca 
tatgtatgtc 
tttacagagg 
aatagcttct 
gaacactcct 
catggaagat 
ttgttttgca 
agcattcaca 
ctgctcttca 
tttgtcaaga 
agatgcacgt 
accatagtaa 
ccacattttc 
catgagcaat 
ggcagaaaac 
aatggaccta 
aggtcaaact 
ggctcagcca 
aaaagacaac 
tgatgtattc 
tgccatcatg 
gataaataag 
gtgtggcaca 
accaaatgat 
aatcctgcca 
caactttcaa 
gacaaatgga 
ggttctggtc 
cacacaagca 
ccagattggt 
tccatttgcc 
tgagcagctt 
tatggaagag 
cctcacatct 
gtctggcatc 
agaaatcaat 
aagcgtctct 
aaccaacgcc 
agcagatgac 
agcagatgac 
tgacgatgac 
cctagtgtgc 
cctgcaggtg 
ggccctggag 
ctccctctac 
ccgctctaga 
agagatctgt 
tgtttgcccc 
ctaataaaat 
tggggtgggg 
tgcggtgggc 
tcggtacctc 
gcgcgctcac 
cttaatcgcc 
accgatcgcc 
atattttgtt 
ccgaaatcgg 
ttccagtttg 
aaaccgtcta 
ccttaactta 
agaattaaca 
ccaatttatt
ggtcattata
tgtttactcc
atatatgttt
gaagctaaat
tttcgcggaa 
cctgagcttg 
8581 aggggttaac atgaaggtca tcgatagcag gataataata cagtaaaacg ctaaaccaac
8641 aatccaaatc cagccatccc aaattggtag tgaatgatca caaataacag caaacagtaa
8701 tgggccaata acaccggttg cattggtaag gctcaccaat aatccctgta aagcaccttg
8761 ctgatgaccc tttgtttgga tagacatcac tccctgtaat gcaggtaaag cgatcccacc
8821 accagccaat aaaattaaaa cagggaaaac taaccaacct tcagatataa acgctaaaaa

8881 ggcaaatgca ctactatctg caataaatcc gagcagtact gccgtcttct cgcccattta
8941 gcggccatcc ttcctgccac aaaggcttgg aataccgagt gtaaaagacc aagacccgta
9001 atgaaaagcc aaccatcatg ctattcatca tcacgatttc tgtaatagca ccacaccgtg 
9061 ctggattggc tatcaatgcg ctgaaataat aatcaacaaa tggcatcgtt aaataagtga 
9121 tgtataccga tcagctttcg ttccctttag tgagggttaa ttgcgcgctt ggcgcaatca

9181 tggtcatagc tgtttcctgt gcgaaattgt tatccgctca caattccaca caacatacga 
9241 gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt 
9301 gcgtcgcgct cactgcccgc ttcccagtcg ggaaacctgt cgtgccagct gcattaatga 
9361 atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 
9421 actgacccgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca cccaaaggcg

9481 gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgcg agcaaaaggc 
9541 cagcaaaagg ccaggaaccg taaaaaggcc gcgttgccgg cgtttttcca taggccccgc
9601 ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 
9661 ctataaagat accaggcgtc tccccccgga agcccccccg tgcgctctcc cgttccgacc 
9721 ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgCggc gctttctcat

9781 agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagcC gggctgtgtg 
9841 cacgaacccc ccgttcagcc cgaccgctgc gccccatccg gtaactatcg tcttgagtcc
9901 aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga
9961 gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact
10021 agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt

10081 ggtagctctc gatccggcaa acaaaccacc gctggtagcg gtggtttttc tgtctgcaag 
10141 cagcagatta cgcgcagaaa aaaaggatct caagaagacc cttcgatctt ttctacgggg 
10201 tctgacgctc agtggaacga aaactcacgt taagggatct tggtcatgag attatcaaaa
10261 aggatctcca cccagatcct tttaaattaa aaatgaagtt ctaaatcaat ctaaagtata
10321 tatgagtaaa cttggcctga cagttaccaa tgcttaatca gtgaggcacc catctcagcg

10381 atctgcccat ctcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 
10441 cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 
10501 gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagcggtcct 
10561 gcaacttcat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 
10621 tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc

10681 tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 
10741 tcccccatgt tgtgcaaaaa agcggttagc tcctccggtc ctccgatcgt tgtcagaagt 
10801 aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 
10861 atgccatccg taagatgctt Ctctgtgact ggtgagtact caaccaagtc actctgagaa 
10921 tagtgtatgc ggcgaccgag ttgctcctgc ccggcgtcaa tacgggataa taccgcgcca

10981 catagcagaa ctttaaaagt gctcatcact ggaaaacgtt cttcggggcg aaaactctca 
11041 aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 
11101 tcagcatctt tcacttccac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 
11161 gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa
11221 tattattgaa gcacttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt

11281 tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc ac